SOME FACTORS IN THE PROGRAMING OF
CONCEPTUAL LEARNING¹


ROBERT M. GAGNE AND LARRY T. BROWN


Princeton University 


Studies of conceptually mediated 
behavior in human beings are often 
concerned with the effects of varia- 
tions in previously established con- 
cepts on behavior in the solution of 
problems. This is true, for example, 
of the work of Maier (1930) with 
practical problems; of Luchins (1942) 
on water-jar problems; and of Maltzman
and his collaborators (Maltzman,
Eisman, Brooks, & Smith, 1956;
Maltzman & Morrisett, 1952, 1953a,
1953b) on anagrams; to mention a few
well-known investigations. Usually, 
in such studies, the concepts employed 
by S are assumed to have a history of 
previous establishment, and are con- 
sidered to be available to S at the 
time the problem is set. In a different 
category can be placed research on 
the learning (or formation) of concepts,
such as that of Hull (1920),
Smoke (1932), Reed (1950), and Heidbreder
(1947), which need not be
reviewed here. The concepts acquired
in the course of the experiment
are usually not further “used,” as in
the solution of a problem, but are


1 The research reported in this paper was 
supported in part by a grant from the Car- 
negie Corporation of New York. 


simply measured as being “established”
in the sense that they meet
a criterion of learning or recall.

It would appear worthwhile to devote some effort to the experimental
study of events which bridge these two processes of concept learning and
utilization; in other words, to examine the question of how concepts (or
concept sequences) which are newly learned subsequently enter into the
activity of solving problems. There is the related question, too, of the
extent to which observations of problem solving performance may throw
light upon the effectiveness of learning of the concepts employed. This is
the general framework of the present study.

Although there are a number of isolated studies of the learning of
principles for problem solving, perhaps the best known intensive work
on this subject is that of Katona (1940). His investigations demonstrate
the relative ineffectiveness of “memorizing” and of “verbal principle
learning” in the solution of card-trick and match-stick pattern
problems, as compared with what Katona refers to as “understanding.”
A number of Katona’s findings have been confirmed and amplified by
other investigators (Hilgard, Edgren, & Irvine, 1954; Hilgard, Irvine, &
Whipple, 1953). According to Katona, the learning of “senseless
connections” may be contrasted in its effectiveness for the solution of new
problems with the learning of “meaningful organization.” When learning
partakes of the apprehension of meaningful wholes, he considers it to be
the sort which favors transfer to problem solving situations. In Katona’s
experiments, the greatest success in solving card-trick problems was
achieved by Ss who listened to an explanation of a basic problem and
watched a step-by-step demonstration on the part of E. In the case of these
materials, this method was found to be the most effective, followed in
order by (a) learning a verbally stated principle; (b) memorizing the steps
in the procedure; and (c) no training.

Of considerable interest to those who want to study conceptual learning
has been the recent development of teaching machines and the programing
of learning materials for such machines (see Galanter, 1959). In
general, descriptions of these developments have made it clear that the
intended use, at least, of teaching machines and their related procedures
is for the establishment of useful concepts (often called “verbal
behavior”), as opposed to the memorization of rote materials. It would also
appear that most investigators of learning programing believe that
procedures can be developed which will convey the kind of “understanding”
that Katona describes, at least in terms of their effectiveness in
establishing whatever it takes to solve problems. However, from an
examination of representative published examples of programs (e.g., Holland,
1959; Skinner, 1958) it is not immediately apparent that they are
conveying “understanding” in the
sense of capability for inducing trans- 
fer to new problem situations. They 
appear to be concerned primarily with 
the usage of words in a variety of 
stimulus contexts (e.g., the word 
“incandescent” in Skinner’s 1958 ex- 
ample). It seems possible, therefore, 
for such programing of conceptual 
material to induce the learning of 
“verbally-stated principles” which Ka- 
tona considers from his evidence to 
be less than optimal for problem 
solution. Whether this is actually the 
case or not cannot be told from 
present evidence. The very impor- 
tant reason for this is that no in- 
vestigation of learning programing 
has used a measure providing evidence 
of transfer of the learning to per- 
formance in a problem situation. 
Specifically, the purpose of the present study was to obtain a measure
of learning effectiveness of programed material, in terms which would permit
a reasonable inference that “understanding” had been accomplished,
and to relate this measure to certain characteristics of the programing.
The materials pertained to number series, and it was intended that Ss
would learn to state and use formulas for the sum of any number of terms
in such series. In measuring effectiveness, a test was used which
required transfer of learned principles to a novel problem solving situation.
The characteristics of the program which were varied may be described
as follows: (a) step-wise vs. abrupt presentation (an extreme of “large
steps” in a program); (b) encouragement of “discovery” of principles vs.
“verbal statement” of principles. It may be said parenthetically that the
relatively inadequate specificity with which such variations in stimulus
materials can presently be described is fully recognized. It was hoped that
the present experiment would point the way to further refinement of such
description in research to follow.


METHOD 
Apparatus 


For the presentation of learning materials, a simple form of a teaching
machine was employed. Cards (4 × 6 in.) containing the items to be
responded to were placed in individual plastic cases of a visible card file.
Each file contained 25 items. The fiberboard backing of the card file was
then clipped to a board mounted on a stand inclined at about 40°, which was
placed on a table in front of S. The correct response to each item
was printed on the back of the card. The S read the item, wrote his response
on an answer sheet numbered to correspond with the item numbers, and then
flipped the card downward towards him to check the answer shown
on the back of the card. When 25 items were completed in this way, E simply
replaced the visible card file with another containing subsequent items in
sequence.


Materials 


The learning programs which were devised to teach principles pertaining
to number series consisted of an Introductory program, used by all Ss;
and three programs incorporating differences in experimental treatment,
which were designated Rule and Example (R&E), Discovery (D), and Guided
Discovery (GD).

The Introductory program contained 89 items, beginning with one which
read: “Here is a series of numbers: 1 3 5 7 9 11 13. What are the next two
numbers in this series?” Its intention was to establish learning of basic
concepts, in a definitional sense, that would be used in later learning and
problem solving. Specifically, these concepts were term value (any
individual term in the series) and its symbol, $T$; term number (the
position of a term in a series) and its symbol, $n$; and other derivatives
of these such as $n + 1$, $T_{n-1}$, a similarly indexed $\Sigma$, etc. The
program was constructed of small steps, in accordance with principles
described by Skinner (1958, 1959) and various other investigators (cf.
Galanter, 1959). It was revised after initial tryout with a few Ss, and
seems to have achieved a fairly successful form. (The average number of
errors made by boys of Grades 9 and 10 over the 89 items was 12.3.)

Each of the experimental treatment programs began with the item: “Here is
another series. Now we want to obtain a formula for $\Sigma^{n}$ for this
series. Series: 1 2 4 8 16. First, can you fill in the next two term
values?” Each then provided, in the different ways to be described,
practice in using the basic concepts previously learned in connection with
the following four number series:
[Listing of the four number series is not legible in the source.]


In each case, the series was ultimately displayed in three rows which
identified the term number, the series, and the sum of terms. In addition,
arrows were used to direct attention to the column in which the terms to be
employed in the formula were to be found. The following is an example:


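Term No.:  1  2  3  ...
Series:    1  ...
[Remainder of the display, including the sum row and the arrows, is not legible in the source.]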
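
A minimal sketch of the three-row display described above, built for the series 1 2 4 8 16 from the opening item of the treatment programs. The function name and column spacing are assumptions made for illustration, and the arrows of the original cards are not reproduced.

```python
from itertools import accumulate


def three_row_display(series):
    """Return the Term No. / Series / Sum rows for a number series as text."""
    term_numbers = list(range(1, len(series) + 1))
    running_sums = list(accumulate(series))  # sum of the first n terms
    rows = (
        ("Term No.:", term_numbers),
        ("Series:", series),
        ("Sum:", running_sums),
    )
    lines = []
    for label, values in rows:
        cells = "".join(f"{value:>4}" for value in values)
        lines.append(f"{label:<10}{cells}")
    return "\n".join(lines)


print(three_row_display([1, 2, 4, 8, 16]))
```

Reading down any column gives the term number, the term value, and the sum of terms up to that point, which is the relationship the learning items asked S to work with.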


The R&E program, following one or two introductory items for each number
series, began by stating the correct formula in each case, which S was
instructed to write down on his answer sheet. For example, for Series 1 the
formula stated was $\Sigma^{n} = T_{n+1} - 1$. Following this, items
progressed in small steps through a number of examples which required the
identification of terms in the formula, and the finding of numerical values
for $\Sigma^{n}$ by the use of the formula. This program contained Items
90-125, or a total of 36.
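
The Series 1 rule can be checked numerically. The sketch below assumes that Series 1 is the doubling series 1 2 4 8 16 of the opening item (the text does not say so in as many words), and the helper name is hypothetical.

```python
def sum_by_rule(series, n):
    """Sum of the first n terms using the stated rule: sum = T(n+1) - 1."""
    # With 0-based indexing, series[n] is the (n+1)st term value.
    return series[n] - 1


# Assumed Series 1: each term is double the one before it.
series = [1, 2, 4, 8, 16, 32, 64]

for n in range(1, len(series)):
    direct_sum = sum(series[:n])
    assert sum_by_rule(series, n) == direct_sum
    print(f"n={n}  sum={direct_sum}  T(n+1)={series[n]}")
```

For n = 3, for instance, the sum is 7 and the next term value is 8, the same pair of numbers that the D program hints single out.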

The D program began with the same introductory items, then abruptly stated
“What is the rule for $\Sigma^{n}$, in terms of any term number ($n$) or
term value ($T$), or both? Try to do this by yourself. If you need a hint,
turn to the next card. If you know the answer, TELL ME.” The following
cards contained hints which further directed the attention of S to certain
relationships between a specific sum such as 7, and a term value such as 8
by outlining these in red ink. Questions contained on each successive
“hint” card gave progressively more complete guidance, without, however,
stating the answer. It was necessary for S to show his answer to E, since
there are several correct formulas for each series. As a whole, then, the D
program had the following sequence: Number Series 1: 3 items, 2 hints;
Number Series 2: ... 3 hints; Number Series 3: 1 item ...; Number Series 4:
1 item, 3 hints.

The GD program began with the same introductory items as the others.
Following this, an item stated: “What do you have to do to obtain the sum,
or $\Sigma^{n}$? Can you get $\Sigma^{n}$ by adding, subtracting, dividing,
or multiplying the numbers in the row pointed to? For example, when
$\Sigma^{n}$ is 3 (bottom row), the term-value pointed to is 4. If you know
4, you can get 3 by ______.” Successive items gave further examples of this
relationship, in a small-step fashion, until an item was reached which
read: “Are you ready to state a general rule? Try it in symbols, letting
$T_{n+1}$ mean the term-value of the $n + 1$st term.” Then followed several
examples which required the use of the formula in finding specific
numerical values for $\Sigma^{n}$. Altogether this program contained 40
items, numbered from 90 to 129.

For all three groups, the same problem solving test was used to measure
performance. This test comprised four problems, administered individually,
each of which involved a number series with which Ss had no previous
acquaintance. Each of the series was shown in a center row labeled
“Series,” along with an upper row of digits labeled “Term No.” and a lower
row labeled “$\Sigma$.” In each case, the instructions called for S to find
the formula for $\Sigma^{n}$. The series employed in the test were these:


Problem 1: 
Problem 2: 
Problem 3: 


Problem 4: 


The problems were also accompanied by a set of hints to be displayed
successively in a visible card file, under conditions to be described.
These were constructed similarly to those hints used in the D program.


Subjects 

The Ss were boys in Grades 9 and 10, ranging in chronological age from 14
to 16 yr., who had no previous specific knowledge about number series.
Nearly all these boys had either completed or were completing a first
course in algebra; a few were studying general mathematics which does not
include algebra. Their grades in Grade 9 mathematics covered a range from
D+ to A. Equivalent IQ scores on the Henmon-Nelson intelligence test ranged
from 100 to 154, indicating a very considerable spread of basic academic
ability. Eleven Ss were assigned at random to each of the three
experimental groups, for a total of 33. The administration of the program
was discontinued in the case of 2 additional Ss who made more than 55
errors on the Introductory program, and were therefore assumed to be
quantitatively not comparable to the other Ss.


Procedure 


In broad outline, the procedure began with the giving of instructions and
the administration of the Introductory program of 89 items. After a 3-min.
break, S was assigned at random to one of the three experimental groups,
and proceeded to complete one of the three learning programs. He returned
the following day at the same hour, and went through the same program
(excluding the Introductory) once again. A 3-min. break was given, and S
received instructions for the problem solving test, which he then took.

Introductory program.—Instructions were given to the effect that we wanted
S to learn some things about number series. He was then told how to respond
to the cards and how to manipulate them. In the case of an incorrect
answer, he was to turn back the card, read it again, cross out his answer,
and write the correct one, before going on. The E kept a record of errors
and recorded time to complete the 89 items. Very infrequently, he answered
questions by repeating relevant parts of the instructions.

Learning program.—The items of one of 
the three learning programs were administered 
in the usual way, following the 3-min. rest. 
Again, E recorded errors and time for com- 
pletion. With the D program, he also re- 
corded the number of hints used by S. The 
procedure employed with the three learning 
programs insured that, although the Ss 
responded to different sets of learning ma- 
terials, they all achieved successful answers 
to questions about the same four number series.
To make doubly sure that this was so, each
S went through the same learning program 
a second time on the day following his first 
session. In this respect, it may be said that 
all Ss had roughly the same degree of previous 
experience with number series at the time of 
the test. 


Problem solving test.—After a 3-min. rest, instructions were given for the
test. These stated that the problems would be given one at a time, and 10
min. allowed for completion of each problem. The S was told to keep track
of time by glancing at an electric timer placed nearby. He was instructed
to try to solve the problem (finding a formula for $\Sigma^{n}$) by
himself. At the end of 5 min., however, E would say “Look at Hint Number
1,” at which time S was required to turn a card exposing a hint. He could
use further hints if he felt he needed to, without additional instructions
from E.

The test was designed to obtain maximal measurement potential from a
limited number of number series available (since there are not many which
can be treated in the manner covered by the learning program). The E kept a
record of: (a) time to solve each problem (a problem not completed was
recorded as requiring the maximum time of 10 min.); (b) number of hints
employed; and (c) incorrect answers. The last-named measure yielded few
usable scores, and was not employed in the analysis of results. Since each
series can yield more than one correct formula, S was instructed to show
his answers to E, who would tell him whether they were correct. The time at
which S showed an answer was noted and recorded if the answer was correct;
otherwise the timer was kept running and S told to keep trying. In case the
problem was not solved at the end of 10 min., efforts were made to prevent
discouragement of S by pointing to the numerals involved in the correct
formula, without, however, expressing these in words or symbols. As for
correct answers, verbal expressions (using the phrases “term number,” “term
value,” etc.) were accepted in place of symbolic expressions (using $n$,
$T$, etc.), although these were not frequent.
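
The remark that a series can yield more than one correct formula can be illustrated with the training series 1 2 4 8 16; the test series themselves are not legible in this text. Only the first formula below is the one stated for the R&E program. The other two are alternatives supplied here for illustration, and the function name is hypothetical.

```python
def check_formulas(series, formulas):
    """Report, for each candidate formula, whether it matches every running sum."""
    results = {}
    for name, formula in formulas.items():
        results[name] = all(
            formula(series, n) == sum(series[:n])
            for n in range(1, len(series))
        )
    return results


doubling = [1, 2, 4, 8, 16, 32]

# Each candidate takes the series and a term number n (1-based).
candidates = {
    "T(n+1) - 1": lambda s, n: s[n] - 1,          # next term value minus 1
    "2*T(n) - 1": lambda s, n: 2 * s[n - 1] - 1,  # twice the nth term minus 1
    "2**n - 1": lambda s, n: 2 ** n - 1,          # term number only
}

print(check_formulas(doubling, candidates))
```

All three expressions agree on every term, which is why an answer could not be scored against a single printed key and had to be shown to E.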


Scoring performance.—The available measures of performance on the problem
solving test were time taken to solve, and number of hints employed. In
addition to these, a proficiency measure was constructed which utilized
both kinds of information, namely, a weighted time score, using the
formula:

Weighted Time Score = Min. + 4(Hint 1) + 3(Hint 2) + 2(Hint 3) + 1(Hint 4)

Subject to the possible distortion arising from combining experimental
groups, the split-half reliability of this score was determined to be 0.72.

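
The weighted time score can be sketched as follows. The sketch assumes that each “Hint k” term equals 1 when that hint was used and 0 otherwise, that hints were taken in order, and that the score is computed per problem; the text does not spell these points out, and the function name is hypothetical.

```python
HINT_WEIGHTS = (4, 3, 2, 1)  # weights for Hint 1 .. Hint 4, from the formula


def weighted_time_score(minutes, hints_used):
    """Minutes to solve plus the weights of the hints used (assumed 0/1 each)."""
    if not 0 <= hints_used <= len(HINT_WEIGHTS):
        raise ValueError("hints_used must be between 0 and 4")
    # Hints are assumed to be taken in order, so the first `hints_used` weights apply.
    return minutes + sum(HINT_WEIGHTS[:hints_used])


print(weighted_time_score(6.5, 1))   # 6.5 min. plus Hint 1 -> 10.5
print(weighted_time_score(10.0, 4))  # unsolved at 10 min., all hints -> 20.0
```

Under these assumptions an early hint costs more than a late one, so two Ss with the same solution time are separated by how much guidance they needed.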
RESULTS 
Introductory program.—Time to complete the introductory program ranged
from 26.8 to 53.9 min., and errors on these 89 items from 2 to 37. Using a
score of time and errors, a comparison was made among those assigned to
the three experimental groups. The results were as follows: Cond. R&E,
M = 51, SD = 13; Cond. D, M = 49, SD = 12; Cond. GD, M = 53, SD = 10.
So far as this initial performance measure is concerned, the groups may
be assumed to be comparable.

TABLE 1
MEANS, SDs, AND MEAN DIFFERENCES OF TIME, ERRORS, AND HINTS DURING
FIRST AND SECOND LEARNING SESSIONS, FOR EACH CONDITION
[Rows: 1st Learning (Mean, SD); 2nd Learning (Mean, SD); Difference in Means. Cell values are not legible in the source.]
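
The comparability of the three groups on the Introductory program can be examined from the reported means and SDs alone. The paper reports no test statistic for this comparison, so the one-way analysis of variance below is only an illustrative check; it assumes 11 Ss per group and that the SDs are sample standard deviations (n - 1 denominator).

```python
def anova_f_from_summary(means, sds, n_per_group):
    """One-way ANOVA F ratio from group means and SDs, for equal group sizes."""
    k = len(means)
    grand_mean = sum(means) / k  # valid because group sizes are equal
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    ms_between = ss_between / (k - 1)
    # With equal n, the pooled within-group variance is the mean of the variances.
    ms_within = sum(sd ** 2 for sd in sds) / k
    return ms_between / ms_within


# Cond. R&E, D, GD: means and SDs as reported for the Introductory program.
f_ratio = anova_f_from_summary([51, 49, 53], [13, 12, 10], n_per_group=11)
print(round(f_ratio, 2))
```

An F ratio well below 1 means the spread among the group means is smaller than the spread within groups, which agrees with treating the groups as comparable at the outset.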

Learning program.—Table 1 provides values of means and SDs for
the time taken to complete learning during the first and second
(relearning) sessions. Errors during these sessions are given for Cond. R&E
and GD, and number of hints required for Cond. D. Comparisons
among the groups in learning scores simply reveal differences in length
(or difficulty) of the three learning programs. It is of some interest to
note that Cond. D requires relatively the shortest time, during both the
first and second learning sessions. The two other learning methods are
more nearly comparable to each other in this respect, with Cond. GD
showing greater time and error scores.

The main comparison to be made is between the scores for the first and
the second learning sessions, within each condition. The data of Table 1
indicate reductions in both time and errors when these sessions are
compared. The differences for errors and for hints (Cond. D) are not
significant. Differences in time scores between the two sessions are
significant beyond the .01 level (t test) for all three groups. Using the
time scores as measures of learning leads to the conclusion, therefore,
that significant learning gains took place with all three of the
experimental learning methods.

Problem solving performance.—The results of administering the test
containing four new number-series problems are contained in Table 2,
showing means and SDs for the three measures used. Analyses of variance
performed on these data yielded the following values: for time scores,
F = 6.77, P < .01; for number of hints, F = 12.58, P < .01; for weighted
time scores, F = 13.44, P < .01. In the case of time scores, Bartlett’s
test indicates homogeneous variance; for the other two measures, however,
the variances are significantly heterogeneous. Tests of t between
individual means yielded the following values for the time scores: Cond.
R&E vs. Cond. D, t = 5.29, P < .01; Cond. R&E vs. Cond. GD, t = 9.26,
P < .01; Cond. D vs. Cond. GD, t = 2.48, P < .02. For number of hints, the
values for these comparisons in the same order were t = 8.10, P < .01;
t = 9.31, P < .01; t = 1.21, P > .20. For the weighted time scores, these
values were t = 7.61, P < .01; t = 9.96, P < .01; and t = 2.36, P < .05.

TABLE 2
MEANS AND SDs OF THREE PERFORMANCE MEASURES ON THE FINAL NUMBER-SERIES
TASKS, FOR EACH CONDITION
[Columns: Cond.; Time (Min.), M and SD; No. of Hints, M and SD; Weighted Time Score, M and SD. Most cell values are not legible in the source; the surviving fragments read “46.8 | 16.8” and “28.4 | 7.”]

In sum, both the time scores and the weighted time scores show
significant differences (.05 level) between means for the individual
conditions. In the case of number of hints, the means of Cond. D and GD do
not differ significantly, but both differ from the mean of Cond. R&E. The
most effective condition of the three appears to be GD; R&E is the least
effective, and D somewhere in between.


DISCUSSION 


Several things are shown by the results 
of this experiment. First of all, it is 
clear that the construction of learning 
programs by the use of small steps and 
the manipulation of other, similar vari- 
ables which obliquely characterize con- 
tent, can lead to programs which differ 
markedly in effectiveness. By the latter 
term we mean here the transfer of what 
is learned to a problem solving situation. 
In this study, both the R&E program 
and the GD program were designed to 
provide the kind of learning situation 
which Skinner (1958) and others have 
described. (It is recognized, however, 
that a further reduction in errors would 
be possible by redesign of some items.) 
The results obtained by comparison of 
the first and second (relearning) sessions 
give evidence that a significant amount 
of learning occurred in both programs. 
Just as obviously, learning also occurred 
in the D program, which may in one 
sense be characterized as having used 
“very large steps.” Although of course
no direct comparison is possible because 
of the differences in these materials, it 
would apparently be difficult to make a 
choice among these programs, insofar as 
amount of learning is concerned. All 
of them show significant gains in per- 
formance from the first to the second 
learning session. They are all comparable
in the sense that Ss “worked through”
the same four number series.

When a final performance in finding formulas for four new number series is
measured, however, the picture is a very different one. With such a transfer
measure, the performances of the three groups split apart quite markedly.
The programs containing small steps are the ones which differ most in their
effectiveness in terms of this measure, whereas the large step program (D)
falls somewhere in between. The suggestion is irresistible that what has
been learned in this situation is of greater effect than how it has been
learned.

The finding that a Discovery method leads to greater transfer than does a
Rule and Example method is quite consistent with the findings of Katona
(1940) as well as other investigators (Haslerud & Meyers, 1958; Hilgard et
al., 1954; Hilgard et al., 1953). Katona’s work itself suggests that the
individuals who “memorize” solutions to his problems were simply not
learning the right things. Left to discover solutions, they somehow were
able to acquire the “knowledge” needed to solve new and unfamiliar
problems. However, a noteworthy aspect of the present findings is that it
was possible to guess at what some of these right things might be, and so
to construct a Guided Discovery program which was more effective than
either of the other two.

One of the limitations of this study, certainly, is that it does not tell
us exactly what these “right things” are. It does not seem possible, at the
present state of our knowledge, to design an experiment in this field which
possesses the relatively exact degree of stimulus definition
characteristic, say, of experiments using nonsense syllables with known
association values. Nevertheless, it may be possible to draw some
inferences concerning differences between the R&E and GD programs, which
will point to the kinds of variables that could be studied in more exact
fashion in follow-up experiments.

What were the crucial differences between the R&E and GD programs?
Both provided for step-by-step learning in connection with the same four
number series. Both provided examples in which formulas for the sum of
terms had to be used with specific numerical values. The difference between
them appears to reside in the occurrence of the type of item described for
the GD program (see Method) which does not occur in the R&E program.
Examination of this item, together with one or two subsequent items of a
similar nature, shows that they required the use of previously learned
concepts in a new stimulus context. In responding to such items, S
presumably had to actively produce such concepts as term number, $n$,
term value, $T$, $T_{n+1}$, which he had just previously learned in the
Introductory program, as well as some concepts learned even earlier, such
as subtract, divide, multiply, and add. That is to say, the items
specifically required S to respond by writing, or otherwise reinstating,
these verbal responses. In contrast, it may be noted that the R&E program
did not require this; instead, entities like term number, $T$, $T_{n+1}$
occurred as stimuli (in the formula provided) to which the required
responses were the locating and copying of specific numerals.

Presumably, the D program also required the use (i.e., supplying by S
himself) of previously learned concepts, in a manner similar to GD. We
cannot, of course, identify these specifically, as we can in the individual
steps of the GD program. In other words, we cannot be sure that they were
systematically practiced as responses in arriving at the formulas required
as answers to the four number series. Some of them probably were, while
perhaps some were not. The suggestion is, therefore, that the GD program
led to a superior performance because it required systematic reinstatement
of learned concepts, whereas the D program permitted such reinstatement to
be more of a random or chance affair. We do not consider that our
experiment has proved this, but rather believe it is the kind of inference
which may reasonably be made.

It is noteworthy that the differences produced by the GD and the R&E
programs resulted from differences in a relatively few items (four for each
number series of the learning program). There is consequently a suggested
relation between these findings and those of Maltzman and his associates
(Maltzman et al., 1956; Maltzman & Morrisett, 1952, 1953a, 1953b) who have
found that brief verbal instructions and brief periods of training may
bring about significant differences in performance in the solution of
anagram problems. In terms of Maltzman’s (1955) theoretical formulation, it
might be argued that a few crucial items in the GD program have served to
raise the habit strength of certain connections, thus heightening the
strength of essential habit family hierarchies of the sort previously
mentioned (i.e., those containing such responses as $T_{n+1}$, $n$, as well
as subtract, etc.). The ideas of Saugstad (1955, 1957) regarding the
necessity for availability of concepts in problem solving appear also to be
related. In our experiment, it seems likely that the GD program had the
effect of increasing the availability of the concepts term value, $T$,
$n + 1$, etc., as well as subtract, divide, etc., as we have previously
noted. However, the present results do not make it possible to decide
whether the reinstatement of these concepts accomplished the strengthening
of connections, or whether it perhaps aroused some previously acquired
organization.


In a more general vein, it is certainly 
not a new idea that positive transfer of 
training to a problem solving situation 
should depend upon the resemblance 
between what is practiced in a learning 
situation and what is used in problem 
solving. Surely our experiment provides 
another demonstration of this effect; as 
we have stated, it appears that concepts 
used in finding formulas in the final performance
test were most systematically practiced in the GD program. But this
general rule is not so completely obvious when one tries to apply it to the
development of an effective learning program. For example, it might be said
that the D program, in which Ss were required to find formulas for new
number series, was most like the final situation, in which the finding of
formulas for new number series was again required. Yet this was not the
most effective program, in terms of positive transfer. Thus the present
findings emphasize that the sources of positive transfer are to be
sought in the mediators of behavior,
rather than in the behavior itself. To
find the ways of developing effective 
learning programs, one must search out 
the specific concepts (verbal responses, 
if one prefers such language) which enter 
into the chain of events between the 
stimulus situation and the overt per- 
formance itself. One can probably do 
this on an empirical basis; however, it 
would seem desirable to attempt a more 
generalizable result, by continuing the 
effort to obtain increasingly exact definitions
of “what is learned” in the direction
suggested by this study.

We consider that Katona’s (1940) emphasis on “understanding” is correct
insofar as it emphasizes the necessity for measuring learning effectiveness
in terms of transfer to a problem solving situation. Such a method was
employed in this study. However, our results emphasize the importance of
“what is learned” rather than “how it is learned” as the crucial factor in
learning effectiveness. Discovery as a method appears to gain its
effectiveness from the fact that it requires the individual learner to
reinstate (and in this sense, to practice) the concepts he will later use
in solving new problems. To the extent that the GD program was able to
identify these concepts, it could then provide systematic practice in their
use, and thus lead to a performance superior to that attained otherwise.
The practice provided by the R&E program, in contrast, did not require the
use of these essential concepts (although it permitted it). Accordingly, it
led to a distinctly inferior problem solving performance.


SUMMARY 


An experiment was performed to investi- 
gate the effects of certain variations in the 
programing of conceptual learning materials 
on effectiveness of learning as measured by 
performance in a problem solving situation. 
The programs were designed to foster learning 
of concepts to be used in the deriving of 
formulas for the sum of terms in unfamiliar 
number series. 

Thirty-three boys in Grades 9 and 10 were assigned randomly to three
groups of 11 each, and were tested individually. After first learning
basic concepts pertaining to number series in a “small step”
Introductory program of 89 items, the groups respectively went through
three different learning programs called Rule and Example (R&E), Guided
Discovery (GD), and Discovery (D). In a second session the following
day, relearning of the program was followed by a task containing four
problems, each requiring S to find the formula for the sum of n terms in
a number series. Performance was measured in terms of time to solve,
number of hints required, and a weighted time score combining these.

The results show significant learning gains (P < .01) between Learning
Sessions 1 and 2 under each condition. Final performance scores
indicated best performance for Cond. GD, worst for Cond. R&E,
intermediate for Cond. D. Differences in these measures were tested by
analyses of variance yielding P < .01 in all cases. Variances were found
to be homogeneous for time scores, not for the other two measures. For
all measures, the means for Cond. R&E differed significantly from means
of the other conditions, when tested individually (P < .01). For time
scores, comparisons of all individual means showed significant
differences (P < .02).


Discussion of the results emphasizes the importance of “what is learned”
as opposed to “how it is learned” for problem solving performance. The
Guided Discovery program, using small steps, requires S to reinstate
(i.e., actively produce) certain concepts, a feature which may be
lacking in the Rule and Example program. The Discovery program,
containing “large steps,” may produce such reinstatement in a less
systematic manner.


RESPONSE¹


JOHN T. LANZETTA


Center for Research on Social Behavior,
University of Delaware

Tulane University


1 This research was supported by Grant NSF-G 11427 of the National
Science Foundation and Grant M-3909 from the National Institute of
Mental Health, United States Public Health Service. The writers wish to
express their appreciation to Jo Anne Davis for her assistance in
collecting and analyzing the data.


Several earlier studies (Kanareff & Lanzetta, 1958, 1960; Lanzetta &
Kanareff, 1959) have demonstrated the attenuating effects of negative
sanctions for imitation, induced by instructional variations, on the
acquisition of an imitative response. Subjects, when faced with a choice
between employing a “disapproved” imitative response which has a high
probability of leading to a correct decision or utilizing a positively
sanctioned response of independence or opposition which has a low
probability of leading to a successful choice, appear to compromise: the
proportion of imitation responses elicited is less than expected either
on the assumption that Ss attempt to maximize their frequency of success
or that they “event match” (Estes, 1957).

The use of instructions to manipulate expectations of approval or
disapproval, however, did not allow evaluation or control of the
intensity of the applied sanctions and consequently did not provide a
test of the relative effects of social and task feedback on the
acquisition of an imitative response. In an effort to provide more
continuous and precise control a recent study (Kanareff & Lanzetta,
1960) employed a verbal response from E as a social reinforcer in
conjunction with an independently varied task reinforcement (indication
of correctness or incorrectness). Verbal reinforcers have been shown to
be effective in the conditioning of varied responses (e.g., Buss,
Gerjuoy, & Zusman, 1954; Greenspoon, 1955) but they proved singularly
ineffective in this study. Several possible reasons for the poor
conditioning were recognized, some of which provided the rationale for
the present procedure. For one, E had stressed the importance of being
correct and S was provided a visual signal as an indicator of
performance adequacy. Thus, E’s verbal comments may have evoked minimal
attention. Secondly, the verbal statements “okay” and “good” may not
have had sharply discriminable “meaning” as social reactions. In view of
this it appeared desirable to have the social reinforcement independent
of E as well as independent of the task reinforcement and also to
utilize as socially reinforcing stimuli cues which would be more sharply
discriminable. One such reinforcement would be a coworker’s emotional
response to S’s behavior on each trial.

The present study employs the “galvanic skin response” (GSR) purportedly
from a partner as a social reinforcer, in conjunction with a task
reinforcement. The Ss were given the task of predicting whether a red or
green light would be illuminated. On each trial they were presented with
two cues, an indication of correctness of their choice and an indication
of the reaction of their partner. The E controlled both cues and
therefore was able to control their congruence or incongruence, e.g.,
imitation of the partner could lead to a correct judgment but a negative
emotional reaction from the partner. It was anticipated that congruence
of social and task feedback would result in a higher level of imitation
than when the reinforcements were in conflict. More specifically, for
both task reinforcement conditions, it was predicted that the level of
imitation would be a positive monotonic function of the probability of a
positive social response from the “partner” for imitation.
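The two benchmark strategies named above, maximizing and event matching, imply different imitation rates. The following is a minimal sketch, assuming a two-choice task in which imitation is correct with probability p and the opposite response is correct with probability 1 - p; the function names are illustrative and do not come from the study.

```python
def predicted_imitation(p_correct_if_imitate: float, strategy: str) -> float:
    """Predicted long-run proportion of imitative responses (illustrative)."""
    if strategy == "maximize":
        # Always choose whichever response is more often correct.
        if p_correct_if_imitate > 0.5:
            return 1.0
        if p_correct_if_imitate < 0.5:
            return 0.0
        return 0.5  # both responses succeed equally often
    if strategy == "match":
        # Event matching: imitate as often as imitation turns out correct.
        return p_correct_if_imitate
    raise ValueError(f"unknown strategy: {strategy}")


def expected_success(p_correct_if_imitate: float, p_imitate: float) -> float:
    """Expected proportion of correct predictions for a given imitation rate."""
    p = p_correct_if_imitate
    return p_imitate * p + (1.0 - p_imitate) * (1.0 - p)


if __name__ == "__main__":
    for strategy in ("maximize", "match"):
        rate = predicted_imitation(0.80, strategy)
        print(strategy, rate, expected_success(0.80, rate))
```

An observed imitation rate below both predicted values is what the text describes as a compromise.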


METHOD 


Apparatus and procedure.—The Ss were tested in pairs, but were seated in
separate enclosed booths. Each booth contained a response box on which
were two response keys and two pairs of red and green signal lights
mounted one above the other. The task was to predict whether the red or
green light of the top pair of lights would be illuminated. The S
indicated his prediction by pressing the appropriately labeled response
key. Each S was led to believe that his partner’s choice, indicated by
the lights above the keys, was given first. The E was thus able to
simulate the partner by presenting a programed sequence of “partner’s”
choices to each S. The correctness of S’s choice was indicated by the
top row of lights on the response box.

In addition to the response box, there was placed in each booth a
“galvanometer” which Ss were told would measure GSR reactions. Two
electrodes purportedly connected to a galvanometer in the “partner’s”
booth were attached to S’s left palm in order to allow the “partner” to
see S’s GSR. The galvanometer in S’s booth was similarly to inform S of
the “partner’s” emotional responses. The galvanometers were disguised dc
voltmeters which were controlled by E. The “partner’s” emotional
response was programed by E according to the schedule of social
reinforcement in effect.²


was pro- 
schedule of 


The sequence of task events is as follows: (a) A buzzer signal indicates
the start of a trial. (b) Within 3 sec. E selects the “partner’s” choice
and it is displayed on Ss’ response boxes. (c) This serves as a signal
for both Ss to make a choice. (d) Two seconds after Ss respond the GSR
reading is displayed. It remains displayed for 4 sec., until the red or
green “outcome” light is illuminated. (e) The “outcome” signal remains
displayed for 3 sec. One second after it terminates the buzzer is
sounded starting a new trial.
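The trial sequence can be laid out as a timeline. The sketch below uses the intervals stated in steps (a) through (e); the time E takes to display the "partner's" choice and the latency of Ss' responses are not reported, so both are hypothetical parameters.

```python
def trial_timeline(partner_delay: float = 3.0, response_latency: float = 1.0):
    """Return (event, onset in sec.) pairs for one trial.

    partner_delay: time until the "partner's" choice is shown (at most 3 sec.).
    response_latency: time Ss take to respond; not reported, a placeholder.
    """
    if not 0.0 <= partner_delay <= 3.0:
        raise ValueError("the partner's choice is displayed within 3 sec.")
    t_buzzer = 0.0
    t_partner = t_buzzer + partner_delay
    t_response = t_partner + response_latency
    t_gsr = t_response + 2.0         # GSR reading appears 2 sec. after response
    t_outcome = t_gsr + 4.0          # GSR stays up 4 sec., then outcome light
    t_outcome_off = t_outcome + 3.0  # outcome signal displayed for 3 sec.
    t_next_buzzer = t_outcome_off + 1.0
    return [
        ("buzzer", t_buzzer),
        ("partner choice displayed", t_partner),
        ("Ss respond", t_response),
        ("GSR reading displayed", t_gsr),
        ("outcome light on", t_outcome),
        ("outcome light off", t_outcome_off),
        ("next buzzer", t_next_buzzer),
    ]


if __name__ == "__main__":
    for event, onset in trial_timeline():
        print(f"{onset:5.1f} sec.  {event}")
```

Under these placeholder latencies the GSR reading always precedes the outcome light by 4 sec., so the social cue is seen before the task cue on every trial.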


Instructions.—The following paraphrased portions of the instructions
indicate the rationale provided to Ss:

Most of you are probably familiar with gambling phenomena and the kinds
of guesses that people make when gambling. We’re primarily interested in
seeing how autonomic reactions, that is, emotional reactions are related
to the types of guesses people make. You will be asked to make a choice
as to whether the red or green light on this box will be illuminated.
Indicate your choice by pressing the appropriate button on the box in
front of you. The apparatus is constructed so as to automatically
produce a sequence of red and green lights for you to predict.

The particular autonomic function we’ll be measuring today is palmar
sweating. In the past our Ss have shown a great deal of curiosity about
this response so we tried to arrange the equipment to allow you to watch
the GSR readings. Since the guesses you make might be affected if you
see your own reactions, we can’t let you watch your own GSR. However, we
can let you watch your partner’s reactions so that you can get some
feeling for the way emotional tension changes over time.

We have a pair of electrodes and a galvanometer. Changes in skin
resistance are picked up by the electrodes and transmitted to the
galvanometer. As you can see, when the pointer falls here on the scale
the emotional state is positive, the person feels pleased about what
goes on; when it falls in this area, the emotional state is negative, he
feels displeased about what goes on. The galvanometer will be connected
for readings as soon as both of you make your predictions. The pointer
will rest in the center position until both of you make your
predictions, it will then move to the right or left of center showing
your partner’s reaction to what went on.

2 We are indebted to members of the staff of the Communications Social
Science Research Department of Bell Telephone Laboratories for
suggesting this experimental technique.


Subjects and design.—The three probabilities of social reinforcement
used were 1.00 for imitation, .50 for imitation, and 1.00 for
opposition. On the 1.00 schedules the “partner’s” emotional response was
positive (pleased) for all imitation (or opposition) responses, i.e.,
for all choices identical (or opposite) with the partner’s. On the .50
schedule the “partner’s” emotional response was positive on 50% of the
trials and negative on the other half, the order of positive and
negative responses being randomized.

The two probabilities of task reinforcement used with each schedule of
social reinforcement were .80 and .50. On the .80 schedule the
“partner’s” choice was called correct on 80% of the trials, i.e.,
imitation of the partner was instrumental to being correct. On the .50
schedule the partner’s choice was correct on one-half of the trials.
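A programed schedule of this kind can be sketched as a shuffled list with a fixed proportion of positive events. The code assumes exact proportions over the 100 trials and a single shuffle over the whole session; whether the original randomization was done this way, or within blocks, is not stated, and `make_schedule` is a hypothetical helper.

```python
import random
from typing import List


def make_schedule(n_trials: int, p_positive: float, rng: random.Random) -> List[bool]:
    """Return a randomized list with exactly round(n_trials * p_positive) True entries.

    True stands for a positive event on that trial (the partner's choice is
    called correct, or the partner's "GSR" shows a pleased reaction).
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must lie between 0 and 1")
    n_positive = round(n_trials * p_positive)
    schedule = [True] * n_positive + [False] * (n_trials - n_positive)
    rng.shuffle(schedule)  # assumption: one shuffle over the whole session
    return schedule


if __name__ == "__main__":
    rng = random.Random(1961)  # arbitrary seed, for reproducibility only
    task_80 = make_schedule(100, 0.80, rng)
    social_50 = make_schedule(100, 0.50, rng)
    print(sum(task_80), sum(social_50))
```

Because the task and social lists are generated independently, congruent and incongruent trials both occur within a session.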

The three probabilities of social reinforcement and two probabilities of
task reinforcement formed a 3 × 2 factorial design. Forty-eight male and
24 female college student volunteers were randomly assigned in like-sex
pairs to the six cells with the restriction that there be 8 male and 4
female Ss in each cell.

The dependent variable was the number of imitation responses per
10-trial block. There were 100 trials.
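The restricted random assignment can be illustrated as follows. This is a sketch under stated assumptions: pairs (not individuals) are the unit of assignment, each cell receives four male pairs and two female pairs, and the level labels and the `assign_pairs` helper are illustrative only.

```python
import itertools
import random

SOCIAL_LEVELS = ["1.00 imitation", ".50", "1.00 opposition"]  # illustrative labels
TASK_LEVELS = [".80", ".50"]


def assign_pairs(rng: random.Random):
    """Randomly assign like-sex pairs to the six cells of the 3 x 2 design."""
    cells = list(itertools.product(SOCIAL_LEVELS, TASK_LEVELS))
    male_pairs = [("M", i) for i in range(24)]    # 48 males in 24 pairs
    female_pairs = [("F", i) for i in range(12)]  # 24 females in 12 pairs
    rng.shuffle(male_pairs)
    rng.shuffle(female_pairs)
    assignment = {cell: [] for cell in cells}
    for k, cell in enumerate(cells):
        # Restriction: 8 males (4 pairs) and 4 females (2 pairs) per cell.
        assignment[cell].extend(male_pairs[4 * k: 4 * k + 4])
        assignment[cell].extend(female_pairs[2 * k: 2 * k + 2])
    return assignment


if __name__ == "__main__":
    for cell, pairs in assign_pairs(random.Random(7)).items():
        print(cell, pairs)
```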


RESULTS 


Experiment 1.—Figure 1 and Table 1 indicate the major results: (a) The
task reinforcement significantly affected the rate and level of
acquisition of imitation responses. (b) The level of social
reinforcement had little effect on the frequency of imitation for either
of the task reinforcement conditions.

FIG. 1. Mean number of imitative responses per block of trials as a
function of probability of task reinforcement and probability of social
reinforcement.

TABLE 1
ANALYSIS OF VARIANCE OF IMITATIVE RESPONSES
(Sources listed: Task reinforcement, Social reinforcement; the remaining
rows and the MS values are not recoverable. Table footnote, fragmentary:
“Ss within X T” is used as the ... “Ss within,” and all interactions
involving ...; “Ss within” is used for all other terms.)

Interviews at the termination of the session indicate that Ss were
generally aware of the partner’s emotional response throughout the
session.

For the present population of Ss and present conditions, it appears that
the efficiency of a response for achieving success is more important in
determining its rate of utilization than a partner’s “emotional
reaction” to the response. It is possible that the social motive,
approval by others, is weak relative to the aroused achievement motive,
and Ss regard the partner’s response as essentially “noisy” information.
This suggests two further experimental possibilities: eliminating the
task reinforcement (indication of correctness) might increase the
efficacy of the social reinforcement; and increasing the strength of the
motivation for approval might increase the effects attributable to
social reinforcement. Two brief exploratory studies were performed to
check on these suggestions.

Experiment 1a.—The first point was examined by simply eliminating the
task reinforcement. The procedure was identical to that reported above
except that the correct answer was not given to Ss after each trial. At
the end of 100 trials they were told how many correct predictions they
had made. There were 16 males, run in pairs, 8 under each of the two
schedules of social reinforcement, 1.00 for imitation, 1.00 for
opposition. Figure 2 indicates that simple elimination of the task
reinforcement did not appreciably enhance the effects of the social
reinforcement. In general, the level of imitation is higher when
imitation is socially reinforced but the difference is not significant.

Experiment 1b.—The second test required some method for strengthening
the motivation for approval. This was accomplished by informing Ss that
on the basis of their performance in the present session certain Ss
would be chosen for subsequent studies on decision making for which they
would be paid. They were informed that two criteria would be important
in selection, their accuracy in prediction and their compatibility. To
assess the latter they would be asked to complete a compatibility rating
scale on their partner at the end of the session. In all other respects
the method was identical to that used in the above described study.
There were six males under each of the two schedules of social
reinforcement, 1.00 for imitation, 1.00 for opposition.

FIG. 3. Mean number of imitative responses per block of ten trials as a
function of probability of social reinforcement: No task feedback;
compatibility rating.


Figure 3 indicates that the introduction of the compatibility evaluation
did result in a marked effect attributable to the social reinforcement.
The Ss imitated when imitation was approved of by the “partner” and
opposed when opposition was approved by the “partner.” The mean
difference between the conditions is significant (F = 8.61, df = 1/10,
P < .05). It appears that Ss, our college Ss at least, are not terribly
concerned about social approval unless it is instrumental to some more
important goal.

The results of a further test of this same possibility, using 20 Ss per
condition, are shown in Fig. 4. The difference between the two
conditions is significant (F = 8.49, df = 1/36, P < .01). It should be
noted, however, that the effects of a 1.00 schedule of social
reinforcement appear to be much less than the effects of a .80 schedule
of task reinforcement (compare .80 conditions in Fig. 1 and 4).

FIG. 4. Mean number of imitative responses per block of ten trials as a
function of probability of social reinforcement: No task feedback;
compatibility rating.
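The significance levels quoted for the two F ratios can be checked numerically. The sketch below relies on the standard identity that an F ratio with 1 numerator degree of freedom equals the square of a t statistic with the same error degrees of freedom, and integrates the t density with Simpson's rule. It is an illustrative check using only the Python standard library, not the computation used in the original analysis.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def p_value_f_1_df(f_ratio: float, df_error: int, steps: int = 2000) -> float:
    """Upper-tail probability of F(1, df_error), using F = t**2.

    steps must be even (composite Simpson's rule).
    """
    if steps % 2:
        raise ValueError("steps must be even")
    t = math.sqrt(f_ratio)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    area_0_to_t = total * h / 3
    # P(F > f) = P(|T| > t) = 1 - 2 * P(0 < T < t)
    return 1.0 - 2.0 * area_0_to_t


if __name__ == "__main__":
    print(p_value_f_1_df(8.61, 10))  # Fig. 3 comparison
    print(p_value_f_1_df(8.49, 36))  # Fig. 4 comparison
```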


DISCUSSION 


The results clearly confirm our earlier 
findings on the effects of different sched- 
ules of task reinforcement on the acquisi- 
tion of an imitative response: the level 
of imitation is a positive function of the 
probability of task reinforcement for 
imitation. However, as in an earlier 
study using verbal feedback from E 
(Kanareff & Lanzetta, 1960), social feed- 
back did not appreciably affect the 
probability of an imitative response 
under either of the task feedback 
conditions. 

The inefficacy of the GSR as a social 
reinforcer may be attributable to many 
factors, of which the following appear 
most plausible: 

1. When two cues are available, one 
of which is specified as relevant to the 
primary task of performing effectively, 
Ss will attend to the cue specified as 
relevant. Operant conditioning studies, 
especially those utilizing a verbal rein- 
forcer, typically provide cues which are 
unspecified; they may be interpreted 
either as task or social feedback. In the 
present study Ss receive task informative 
feedback as well as independent social 
feedback which is known to be irrelevant 
to task performance. Terrell and Ken- 
nedy (1957) have demonstrated that 
when one of two sets of cues is labeled 
as task informative while the alternative 
set is left unspecified, the specified set 
predominates. Our results may be re- 
flecting the operation of such a selective 
reaction to labeled features of the 
environment. 

2. Subjects may not have been motivated to elicit positive reactions
from the partner. In view of the results for the .5 task reinforcement
condition and the data obtained under “no task feedback” conditions the
labeling of cues per se postulated above does not appear to be the
critical variable. More important may be the cues which define the
primary objective of performance. The Ss may define their task to be
that of attaining a high number of correct responses and only cues which
are specified as relevant to this task are attended to. When the
specified cues are invalid or when no cues are specified as task
informative Ss may attempt to utilize other nonlabeled cues which are
available but they will not employ cues which have been specified as
nonrelevant by E. Implicit in such an analysis is the assumption that
the experimental conditions are highly effective in arousing strong
achievement motivation; so strong indeed that even when no cues are
provided for guiding behavior toward achievement goals Ss do not respond
to cues which allow them to achieve a presumed secondary goal of
“pleasing the partner.” The results obtained when a more explicit effort
is made to arouse or heighten the motivation to “please the partner”
lend some support to the assumption that affiliation motives were weak
relative to achievement motives in the major study.

3. Subjects did not “perceive” the contingency between the partner’s GSR
responses and their own behavior but rather assumed that the partner was
responding to his own success or failure. This is not to imply that a
“perceived” contingency is necessary or sufficient for conditioning;
there is some evidence that conditioning can be achieved even when Ss do
not report being aware of a contingency between response and reinforcing
cue (e.g., Dailey, 1953; Essman, 1959) although Levin (1961) has
recently challenged these findings. Rather it is assumed that a set to
associate a cue with an event over which one has no control will
attenuate the effectiveness of the cue as a reinforcer.

Unfortunately, no systematic procedure was employed to determine whether
Ss did in fact perceive a relationship between their own responses and
partner’s GSR. Some Ss spontaneously volunteered information which
indicated that such an association was often made. The same remarks,
however, force one to question the assumption that Ss, if concerned
about the partner’s reaction at all, are motivated to elicit a positive
GSR response. One S spoke with glee of his ability to negatively arouse
his partner.


The two exploratory studies do not 
efficiently rule out any of these possible 
explanations of the inefficacy of the 
social reinforcement when paired with a 
task reinforcement. Elimination of the 
task informative feedback did not en- 
hance the effects of the social feedback, 
which casts some doubt on the viability 
of Explanation 1, but does not help in 
discriminating between 2 and 3. Intro- 
duction of the additional criteria of 
“compatibility’’ did enhance the effects 
of the social reinforcer but the induction 
was not evaluated under conditions where 
task and social feedback are both present. 
In addition, the increased conditioning 
with this induction may be attributed 
to either heightened motivation to please 
the partner per se (Explanation 2) or 
to an increase in observational responses 
of the GSR cue which would increase 
the likelihood of Ss perceiving a con- 
tingency between their own behavior 
and the partner's GSR (Explanation 3). 
Further studies are obviously necessary 
in order to select among these alternative 
possibilities. 


SUMMARY 


The present study examined the effects 
of a social reinforcement (emotional response 
of a partner) either congruent or in conflict 
with a task reinforcement (indication of cor- 
rectness) on the frequency of utilization of an 
imitative response in a two-choice prediction 
situation. 

Previous studies have amply demonstrated the efficiency of various
social responses as reinforcers for a variety of classes of behavior.
Typically, however, the social reinforcer has not been paired with an
objective indication of performance adequacy; “social reality” has been
the only basis provided for evaluating performance (e.g., Asch, 1956;
Festinger, 1954). The present study indicates that for the limited set
of conditions employed, at least, a social reinforcer is much less
effective in modifying behavior than an objective indication of response
adequacy. The efficacy of the emotional response of the partner was
enhanced by making it instrumental to other, more remote, goals but even
under such conditions it was less effective in modifying response
probabilities than a task reinforcement.


RUNNING SPEED IN RATS AS A FUNCTION OF DRIVE
LEVEL AND PRESENCE OR ABSENCE OF
COMPETING RESPONSE TRIALS¹


GEORGE A. CICALA²


Princeton University 


In a recent paper, Estes (1958) has 
offered a tentative theory of motiva- 
tion based on the proposition that 
“drive stimuli are simply stimuli with 
no special properties whatever” (p. 
42). This view contrasts sharply 
with that of Hull (1951) who at- 
tributes to drive stimuli both stimu- 
lus properties and special energizing 
properties. 

In Estes’ theoretical analysis of 
drive he argues that increases in run- 
ning speed as a function of increased 
drive level are due not to the energizing
properties of drives, but rather
to the differential sampling proba- 
bility of drive stimuli and the effects 
of extraneous stimuli in the learning 
situation. He maintains that during 
acquisition both differential sampling 
of drive stimuli and extraneous stim- 
uli are responsible for differences in 
running speed usually found at dif- 
ferent drive levels. At the asymptote 
of training he postulates that all drive 
stimuli are conditioned to the relevant 
goal response, and that differences in 
running speed are solely due to the 
distracting effect of extraneous stimuli. 

1 This research represents, in part, a dissertation submitted to the
Princeton University faculty in partial fulfillment of the requirements
for the PhD degree. The author wishes to express his thanks to Byron A.
Campbell for his continuing encouragement and assistance throughout all
phases of this work. While this research was being undertaken, the
author was a Public Health Service Terminal Year Fellow.

2 Now Public Health Service Postdoctoral Fellow, Princeton University.

In an experiment relating to these assumptions as they apply to
asymptotic performance, Cotton (1953) compared asymptotic running speed
on all trials versus those trials on which extraneous responses were not
observed to occur. This technique of analysis was based upon the
rationale that although it was impossible to exclude from the
experimental situation all extraneous stimuli, this effect might be
accomplished by removing from the data all trials on which behavioral
evidence indicated the presence of such stimuli. Cotton found that when
all trials were included in the analysis, running speed was an
increasing function of the period of deprivation. When trials on which
competing responses occurred were excluded from the analysis, however,
this effect was severely diminished. This finding was interpreted by
Estes (1958) as supporting the hypothesis that differences in running
speed at different drive levels were attributable to the presence of
unconditioned, extraneous stimuli.
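The two analyses contrasted above differ only in which trials enter the mean. A minimal sketch follows, assuming each trial is recorded as a running time plus a flag for whether a competing response was observed. The data layout, the `mean_speed` helper, and the choice to average per-trial speeds (instead of converting a mean time) are illustrative assumptions, not details taken from Cotton's report.

```python
from statistics import mean
from typing import List, Tuple

Trial = Tuple[float, bool]  # (running time in sec., competing response observed?)


def mean_speed(trials: List[Trial], distance_ft: float,
               exclude_competing: bool = False) -> float:
    """Mean running speed (ft./sec.), optionally dropping competing-response trials."""
    kept = [time for time, competing in trials
            if not (exclude_competing and competing)]
    if not kept:
        raise ValueError("no trials left after exclusion")
    return mean(distance_ft / time for time in kept)


if __name__ == "__main__":
    # Hypothetical running times for one animal; not data from either study.
    trials = [(2.0, False), (4.0, True), (2.0, False)]
    print(mean_speed(trials, distance_ft=4.0))
    print(mean_speed(trials, distance_ft=4.0, exclude_competing=True))
```

Comparing the two means across deprivation groups shows how much of a drive effect survives the exclusion.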


EXPERIMENT I


This experiment was designed to study the acquisition of a running
response when competing response trials are excluded from the data. The
main interest in this type of analysis stems from the point of view that
acquisition of an instrumental response may be entirely a function of
the elimination of competing responses, although Estes does not suggest
this as a specific possibility. A secondary aim of this study was to
test Estes’ (1958) prediction that learning curves under different
levels of deprivation will converge as the asymptote of training is
approached when competing response trials are removed from the data.


Method


Subjects.—The Ss used in this study were 48 male albino rats of the
Wistar strain, between 90 and 100 days of age at the start of the
experiment.

Apparatus.—The apparatus consisted of a 4-ft. alley with a 1-ft.
startbox and four removable goalboxes 1 ft. in length. The alley, start,
and goalboxes were 6 in. wide. The sides were constructed of aluminum,
the floor of stainless steel grids, and the top was of Plexiglas to
permit E to observe the animal while it was traversing the alley.
Running time was recorded on a Standard Electric timer by means of
photocells mounted 2 in. from each end of the alley.

Procedure.—Initially, Ss were placed in individual cages and given ad
lib. access to food and water for 3 days. They were then assigned to
four groups equal with respect to weight, and placed on a restricted
diet of 10, 15, 20, and 25 gm. of Purina laboratory chow which was
placed in a food cup in the home cages at 3 P.M. each day. After 11 days
of adaptation to this schedule, Ss were conditioned to run a straight
alley for a constant reinforcement of six 45-mg. food pellets
manufactured by the P. J. Noyes Company. The Ss were run between 9 A.M.
and 12 N. each day.


The daily procedure was as follows: 4 Ss were removed from their home
cages and placed in the four goalboxes. The S was removed from the
goalbox and placed in the stationary startbox, after which the goalbox
was positioned at the end of the alley. The S was permitted 3 min. to
enter the alley, after which it was pushed into the start section of the
alley and permitted 3 min. to enter the goalbox. If it had not entered
the goalbox in the 3 min. allotted, it was placed in the goalbox and
left there until all six pellets were consumed. Each S of the squad was
run once on each rotation until each of the 4 Ss had received its
appointed number of trials for that day.

The Ss were given 58 trials in the runway. One trial was given on Day 1,
two on Day 2, three on Day 3, and four on each subsequent day for 12
days. Running times were measured for each trial, and E made a graphic
plot of each S’s path and competing responses as it went down the alley.
Following Cotton’s (1953) procedure, a “competing response” was defined
as any trial on which S stopped, scratched, or exhibited any behavior
incompatible with runway traversal.
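The trial schedule can be tallied directly. The sketch below assumes the 16 conditioning days referred to in the Results (one, two, and three trials on Days 1 to 3, then four trials on each of Days 4 to 16); this is the reading under which the tally reaches the stated total of 58 trials. The function names are illustrative.

```python
def trials_on_day(day: int) -> int:
    """Trials given on a conditioning day: 1, 2, 3, then 4 per day."""
    if day < 1:
        raise ValueError("days are numbered from 1")
    return min(day, 4)


def total_trials(n_days: int) -> int:
    """Cumulative number of runway trials after n_days of conditioning."""
    return sum(trials_on_day(day) for day in range(1, n_days + 1))


if __name__ == "__main__":
    for n_days in (15, 16):
        print(n_days, total_trials(n_days))
```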

The experiment was run in four sections. 
Each part consisted of the following depriva- 
tion groups. Part 1: 10 and 20 gm., Part 2: 
10 and 20 gm., Part 3: 15 and 25 gm., Part 4: 
25 gm. The total number of Ss at each 
deprivation level was 12. Since age and 
environmental conditions were controlled 
during training, the fact that deprivation 
levels were not equally distributed over the 
course of the experiment was considered not 
an important design deficiency. 

In order to establish the reliability of E’s 
judgment concerning S’s behavior on each 
trial, samples of trials in all deprivation 
groups, as well as for different stages of acqui- 
sition, were run with two observers. Out of 
the total of 125 trials (75 with a trained 
psychologist as the O and 50 with a laboratory 
technician trained in biology) the observer 
agreed with E as to whether or not a com- 
peting response had occurred 119 times 
(95% agreement), indicating a high degree of 
accuracy in judging the presence or absence 
of competing responses. No difference in 
reliability was found between the untrained 
technician and the two psychologists. 
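The reliability figure is a simple percentage of agreement between E and the second observer. The sketch below reproduces that arithmetic; `percent_agreement` is an illustrative helper, and the measure makes no correction for agreement expected by chance.

```python
def percent_agreement(n_agree: int, n_trials: int) -> float:
    """Percentage of jointly observed trials on which both judges agreed."""
    if n_trials <= 0 or not 0 <= n_agree <= n_trials:
        raise ValueError("need 0 <= n_agree <= n_trials and n_trials > 0")
    return 100.0 * n_agree / n_trials


if __name__ == "__main__":
    # 119 agreements out of the 125 jointly observed trials reported above.
    print(round(percent_agreement(119, 125), 1))
```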


Results and Discussion 


Figure 1 shows the main results 


of Exp. I. The left-hand portion of 
the figure shows mean running speed 
in the alley when both competing and 
noncompeting response trials are in- 
cluded in the analysis, and the right- 
hand portion shows running speed 
when only noncompeting response 
trials are included in the analysis. 
At the asymptote of training (Days 
10-16) all differences between once- 
removed groups (e.g., 10 gm. vs. 
20 gm.) were significant at beyond the 
.01 level (the Mann-Whitney U, two- 
tailed test was used for this and all 
subsequent difference tests) for both 
analyses. Another important aspect
of the data is presented in Fig. 2,
which shows the mean number of
competing responses observed for each
group for each block of four trials
during acquisition. For each point,
then, the maximum number of competing
responses was 48. Clearly, the
number of noncompeting response
trials varies with drive level. Differences
between once-removed groups
were significant at the .02 level or
better. In short, the data indicate
that the acquisition curve is not
entirely determined by the dropping
out of competing responses during the
early phases of learning. Although
the number of competing responses
decreases over the course of acquisition,
the removal of competing response
trials does not serve to decrease
the divergence of the acquisition functions
for the different drive levels.

Fig. 1. Mean running speed for all trials (left) and for trials on which no
competing response occurred (right): Exp. I. [Panels: with competing R's,
without competing R's. Ordinate: running speed (ft./sec.). Abscissa:
conditioning day. Curves: 10-, 15-, 20-, and 25-gm. groups.]

Fig. 2. Mean number of competing response trials over the course of
acquisition: Exp. I.
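
The group comparisons above rest on the Mann-Whitney U test. As a rough illustration of how the U statistic is formed, the sketch below counts, for every cross-group pair of scores, which group has the higher value. The function name `mann_whitney_u` and the running speeds are hypothetical and are not data from the study; the sketch stops at the statistic and does not compute a significance level.

```python
from itertools import product


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): the number of cross-sample pairs each sample wins.

    A tied pair contributes 0.5 to each sample.
    """
    u_a = 0.0
    for a, b in product(sample_a, sample_b):
        if a > b:
            u_a += 1.0
        elif a == b:
            u_a += 0.5
    # The two U values always sum to the number of cross-sample pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical asymptotic running speeds (ft./sec.); NOT values from the study.
speeds_low_drive = [1.0, 1.2, 1.1]
speeds_high_drive = [2.0, 2.1, 1.9]

u_low, u_high = mann_whitney_u(speeds_low_drive, speeds_high_drive)
print(u_low, u_high)
```

The smaller of the two U values is the one ordinarily referred to a table of critical values for the given group sizes.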


EXPERIMENT II


In Exp. I it was noted that the 
removal of competing response trials 
did not cause a greater convergence 
as the asymptote of training was 
approached than was found when all 
the trials were included in the analy- 
sis. Because of this finding, Exp. II 
was designed to study running speed 
with and without competing response 
trials as a function of the level of
drive at the asymptote of training.
Specifically, it was hoped that this
study would provide a test of Estes'
(1958) assumption that, at the asymptote
of training, speed on trials without
competing responses will not vary as a
function of drive level.


Method 


Subjects and apparatus.—The Ss used in 
this experiment were 21 male albino rats of the 
Wistar strain, about 130 days of age at the 
beginning of the experiment. The apparatus 
used was the same as that employed in Exp. I. 

Procedure.—The Ss were divided, by
matching according to weight, into three
groups of 7 Ss each. Group LD (Low Drive)
was fed for 2 hr. before the daily running
time (7 A.M. to 9 A.M.). Group HD (High
Drive) was fed for 2 hr. after the regular
running time (1 P.M. to 3 P.M.), and Group A
(Alternated) was alternated daily between
the before and after feeding time. For purposes
of presentation Group LD was considered
to be 0 hr. deprived, and Group HD
18 hr. deprived. It should be noted that,
using this feeding schedule, all Ss received
roughly the same amount of food each day.
This technique resulted in weight changes
which were constant among the groups.
Following 16 days of adaptation to this
method of feeding, Ss were given one trial
per day in the runway for a period of 4
days. As in Exp. I, each traversal of the runway
was regularly reinforced with six 45-mg.
food pellets. After the first 4 days of running,
10 days of two trials per day were given,
followed by 32 days of four trials per day.
Running speeds were measured, and trials
on which competing responses occurred were
recorded.
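
The training schedule just described can be tallied in a few lines. This is only an arithmetic sketch of the stated phases; the variable names are illustrative. The totals agree with the last conditioning day (46) and the last trial number (152) cited in the results.

```python
# (number of days, trials per day) for each runway phase stated in the Procedure.
phases = [(4, 1), (10, 2), (32, 4)]

total_days = sum(days for days, _ in phases)
total_trials = sum(days * per_day for days, per_day in phases)

print(total_days, total_trials)  # 46 152
```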


Results and Discussion 


The left-hand portion of Fig. 3 
shows mean running speed per day 
for the last 11 days of training, 
Trials 108-152, when all trials includ- 
ing both competing and noncompet- 
ing response trials are used in the 
analysis. The asymptotic nature of 
performance at this stage is indicated 
by the lack of any significant trend 
toward higher or lower running speeds. 
It will be noted that Group HD runs 
consistently faster than Group LD 
(P = .05). Group A shifts its per- 
formance consistently with the daily 
drive condition (P = .07) between 
the before and after feeding conditions. 

When competing response trials
are excluded from the data, as shown
in the right side of Fig. 3, Group HD
remains consistently superior to Group
LD (P = .04). Again, it should be
noted that the functions are asymptotic
as indicated by the lack of any
significant trend. That the removal
of competing response trials fails to
diminish the differences in running
speed between Groups HD and LD
indicates that the differences were
not entirely determined by the presence
of extraneous cues. This conclusion
is clearly contrary to Estes'
(1958) assumption that at the asymptote
of training, running speed, when
competing response trials are excluded
from the data, does not increase as a
function of increased drive level. It
appears, rather, that running speed
varies as a function of drive level
even when competing response trials
are excluded from the analysis.

Fig. 3. Mean running speed for all trials (left) and for trials on which no
competing response occurred (right): Exp. II. (Group A was fed before running
on odd-numbered days and after running on even-numbered days.) [Panels: with
competing R's, without competing R's. Ordinate: running speed (ft./sec.).
Abscissa: conditioning day, 36-46. Curves: High Drive, Low Drive, Alternated
Drive.]

Fig. 4. The comparison between Cotton's data and the results of Groups
A, HD, and LD of Exp. II.* [Panels: Cotton's data, alternated drive group,
nonalternated drive groups. Ordinate: running speed (ft./sec.); reciprocal
running time for Cotton's data. Abscissa: hours of deprivation. Curves: with
competing R's, without competing R's.]

The results of Group A, however,
appear to confirm Estes' (1958) prediction,
in agreement with Cotton's (1953)
results, since the removal of competing
response trials reduces greatly the differences
in running speed between the two
drive conditions. This relation is shown
more clearly in Fig. 4, which compares the
results of Group A and the nonalternated
groups of the present study with
the findings reported by Cotton (1953).
For the two graphs plotted from the
data of the present study, only the
postasymptotic performance (Days 36-46)
is included. As can be seen, when
the data from Cotton's study are compared
with the data from Group A of the
present study, the functions obtained
are nearly identical. However, when
a similar plot is made for Groups HD
and LD of the present study, the removal
of competing responses does not reduce
the slope of the function.

* The plot of Cotton's data was made
from the reciprocal of the mean running
times in Group II, Test Period II (Cotton,
1953). The data were supplied in a personal
communication. The graph appears to be
the same as that presented by Spence (1956,
p. 171).

It should be remembered that Cotton
(1953) gave all his Ss extensive training
at all drive levels before measuring the
asymptotic response of each S under
each drive level, a procedure very similar
to that used with Group A in the
present study. This procedure was
selected, according to Cotton, in order
to eliminate the possible effects of differential
drive stimulus generalization by
permitting the discrimination of drive
stimuli to occur. It is not clear, however,
that overtraining at different drive
levels eliminates the problem of differential
drive stimulus generalization. Cotton
cites no evidence to that effect,
and the experimental evidence available
does not appear to be conclusive. While
there is some evidence that stimulus
discrimination can occur in the absence
of differential reinforcement (Bitterman,
Calvin, & Elam, 1953; Bitterman &
Elam, 1954; Bitterman, Elam, & Wortz,
1953), this effect has not been demonstrated
with respect to drive stimuli.
Even where differential reinforcement
was used (Bloomberg & Webb, 1949;
Jenkins & Hanratty, 1949), drive stimulus
discrimination requires a great many
trials before it becomes manifest in
differential responding.

In the light of these considerations,
it may be that overtraining at alternated
drive levels does not eliminate the possibility
of drive stimulus generalization,
and, accordingly, that drive stimulus
generalization accounts, in part, for the
differences between the alternated and
nonalternated groups.


SUMMARY 


Two experiments, utilizing Cotton's (1953)
technique of analyzing noncompeting response
trials separately, were performed to determine
the role of extraneous stimuli in the acquisition
and performance of an instrumental
running response. In Exp. I running speeds
were measured during acquisition with
amount of food deprivation as the parameter.
When competing response trials were removed
from the data, normal acquisition
functions still obtained and no greater convergence
of the drive functions was found
than when all trials were included in the
analysis. In Exp. II running speeds were
measured for a High, Low, and Alternated
Drive Group. For Groups HD and LD,
removal of competing response trials did not
reduce differences as a function of drive. For
Group A, however, this operation served to
reduce differences between the two drive
conditions. These results were interpreted
as supporting the view that asymptotic performance
varies as a function of drive and
that this relation obtains even when competing
response trials are removed from the
data.


The present study synthesizes several
recent theoretical trends bearing
on paired-associate learning by postu- 
lating that learning a single pair may 
involve, not just a single link between 
the stimulus member, S, of the pair 
and its response member, R, but 
rather a chain of habits involving at 
least three links. 

The first postulated link in this 
habit chain involves associating with 
each S a mediating, stimulus-pro- 
ducing response, r. This r is a partial 
representation of the S to which it 
becomes connected, encoding some 
aspect of the S which discriminates 
that S from the others in the list. 
Evidence that S does tend to encode
the S only partially and, more specifically,
only with respect to aspects
that usefully distinguish it from
other Ss in the list, is found in
the stimulus predifferentiation studies.
Prelearning a set of Rs to a set of Ss
is found to result in positive transfer
when a new set of Rs are then learned
to the Ss provided the Ss are distinguished
by the same aspects on both
tasks (Goss, 1953). If they are
distinguished by a new aspect, however,
there is no transfer if the old
aspect is missing on the new task
(Hake & Ericksen, 1956); and negative
transfer results when the old
aspect is present and interferes with
the encoding of the alternative aspect
that has been made significant on
the new task (Kurtz, 1955).

1 This paper is based on a dissertation
submitted in partial fulfillment of the requirements
of the PhD degree at Yale University
in 1954. The author is greatly indebted to
Carl I. Hovland for his advice and guidance
and also to C. E. Buxton, R. P. Abelson, and
N. E. Miller.

2 Now at the Department of Social Psychology,
Columbia University.

The difficulty of this first habit
component (S-r) can be manipulated
by varying the physical similarity
(primary generalization) of the set
of Ss, as is done in the present study
and, in principle, by Goss (1953), or
by varying the number of irrelevant
dimensions, as in the typical concept-attainment
study (Pishkin, 1960) in
which S must learn which is the
relevant dimension along which the Ss
can be discriminated.

The second link in the habit chain 
involves learning to make the appro- 
priate gross response, R, to each 
stimulus, s, produced by the mediat- 
ing, labeling response. That the R 
of the pair tends to become associated 
with this mediating s rather than to 
the S itself has been demonstrated 
by Bugelski and Scharlock (1952) for 
experimentally established mediators 
and by Russell and Storms (1955) 
for those provided by language habits. 
That the mediators tend to be the 
discriminating labels attached to the 
Ss is demonstrated by McAllister's
(1953) finding that stimulus predifferentiation
results in positive transfer
to the extent that the differentiating
labels (r) are "relevant" to the
new Rs to be learned for the Ss.

The third link in the habit chain 
which we postulate involves the possi- 
bility that the response member of 
the pair, R, may itself be a chain of 
habits that have to be learned. The 
importance of this "response learning"
has been reviewed by Mandler (1954)
and most recently demonstrated by
the response-meaningfulness (m') studies
(Hunt, 1959; Noble & McNeely,
1957) and the response-similarity
studies (Feldman & Underwood, 1957;
Underwood, Runquist, & Schulz, 1959),
which deal with two somewhat different
aspects of R learning: within-R
synthesis and between-R discrimination.

In summary, learning each pair in a
paired-associate task involves, not a
single S-R connection, but rather
three component habits. The first
involves S and r; the second, s and R;
and the third (which may be a
chain of habits) the R elements,
$R_a, R_b, \ldots, R_n$. Hence, learning a
pair may involve the following habits:

$$S \rightarrow r \rightarrow s \rightarrow R_a \rightarrow s_a \rightarrow R_b \rightarrow s_b \rightarrow \cdots \rightarrow R_n$$



We are ignoring here the remote associations
that probably get formed
and which may account for some slight
discrepancies from the predictions
reported below.

The materials used in the present
study were designed to enable us to
identify in each erroneous instance
(that is, a failure by the S on any
trial to respond to a given S with the
correct R) which of the links in the
chain failed in that instance. By
such an analysis of the errors we can
determine the practice functions, not
only of the usual $H_c$ (the proportion
of pairs correctly anticipated on a
given trial), but also of $H_1$ (the proportion
of correct S discriminations,
i.e., S-r habits, on a given trial), of $H_2$
(the proportion of correct associations,
i.e., s-$R_a$ habits), and of $H_3$ (the proportion
of Rs correctly synthesized,
i.e., $R_a$-$s_a$ ... $R_n$ habits).

Besides indicating processes involved
and sources of difficulty in
paired-associate learning, these separate
indices enable us to compute
independently the curves for stimulus
generalization and for intrusion errors,
the confusion between which has
given rise to that hardy perennial
among verbal learning controversies,
the temporal trend of stimulus generalization
during practice, which has
recently blossomed again among Murdock
(1958, 1959), Battig (1959),
Gibson (1959), and Runquist (1959).


METHOD 


Materials and experimental variations.
The Ss and Rs in this study were designed to
allow independent manipulation of the
difficulty of each of the three sets of habits,
$H_1$, $H_2$, and $H_3$.

Each S learned, by the anticipation method,
a list of nine paired associates. The Ss of
these pairs were nine solid black circles of
varying diameters. Two different sets of
circles were used for different groups of Ss.
In Set I the diameters ranged from .37 to
1.49 cm. in .14-cm. steps; in Set II, from
.37 to .93 cm. in .07-cm. steps. The difficulty of
learning the S-discrimination habits ($H_1$) was
defined as being greater for Set II than for
Set I.
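
As a quick check on the two stimulus sets, the nine diameters in each can be generated from the smallest diameter and the step size given above. The helper name `circle_diameters` is illustrative only.

```python
def circle_diameters(smallest_cm, step_cm, count=9):
    """Return `count` diameters (cm.) rising from `smallest_cm` in equal steps."""
    return [round(smallest_cm + i * step_cm, 2) for i in range(count)]


set_1 = circle_diameters(0.37, 0.14)  # easy S discrimination (wide steps)
set_2 = circle_diameters(0.37, 0.07)  # hard S discrimination (narrow steps)

print(set_1[-1], set_2[-1])  # 1.49 0.93
```

Both sets share the smallest circle; only the spacing, and hence the physical similarity of neighboring stimuli, differs.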

The Rs were numbers. For some Ss these
Rs were the numbers from 1 to 9. For other
Ss the Rs were 3-digit numbers, each beginning
with a different integer between 1 and 9,
and with the second and third digits assigned
by a random procedure. The difficulty of
learning the R-chaining habits ($H_3$) was
defined as being greater for the 3- than the
1-digit Rs. In fact, for the 1-digit numbers,
$H_3$ was defined as having a value of 1.00
from the outset of the experiment; that is,
it was assumed that S would always perform
perfectly at this segment of the task.

The difficulty of learning the s-$R_a$ association
habits ($H_2$) was manipulated by the
manner of assigning Rs to Ss. An advantage
of using sets of Ss that differ along a single
linear dimension, size in this case, is that it
permits hypothesizing what the discriminating
labeling rs will be. With the nine different-sized
but otherwise similar Ss used here
and with the instructions to S described below,
it was assumed that S would use a numerical-type
labeling response; for example, he would
tend to label the smallest circle as "one,"
the next larger as "two," etc., up to the largest,
which he would label as "nine." For some
Ss the Rs were assigned consecutively so that
the successively larger numbers were assigned
to progressively larger circles. For other
groups, Rs were assigned at random to Ss
so that no simple system related size of numbers
and size of circles. The difficulty of
learning the association habits ($H_2$) was, of
course, defined as being greater with the
latter, random method of assignment. In
fact, in the conditions with consecutive assignment
of the numerical responses, $H_2$ was
defined as equaling 1.00 from the outset of
practice; that is, it was assumed that S always
performed this segment of the task perfectly.

Experimental design.—An incomplete $2^3$
factorial design (minus one quadrant) was
employed. The three variables were difficulty
of learning the $H_1$, $H_2$, and $H_3$ sets of habits,
there being an easy and a difficult level of
each as defined above. Table 1 shows the
levels of each of these factors in the six
conditions used, with primes (e.g., $H_1'$)
indicating the difficult condition and no
primes (e.g., $H_1$) indicating the easy condition
on the given link. Ten Ss served in each
of the six conditions, and each S served in only
one condition.

TABLE 1

DESCRIPTION OF DIFFICULTY LEVELS OF COMPONENT HABITS IN THE SIX CONDITIONS,
WITH FORMULAS FOR PROPORTIONS OF CORRECT ANTICIPATIONS ($H_c$) AND OF
INTRUSION ERRORS (IE)

Task              $H_1$   $H_2$   $H_3$   $H_c$*
$H_1H_2H_3$       easy    easy    easy    $H_1$
$H_1'H_2H_3$      hard    easy    easy    $H_1'$
$H_1H_2H_3'$      easy    easy    hard    $H_1 \cdot H_3'$
$H_1'H_2H_3'$     hard    easy    hard    $H_1' \cdot H_3'$
$H_1'H_2'H_3$     hard    hard    easy    $H_1' \cdot H_2'$
$H_1'H_2'H_3'$    hard    hard    hard    $H_1' \cdot H_2' \cdot H_3'$

* These $H_c$ and IE formulas are based on the general equations
$H_c = H_1 \cdot H_2 \cdot H_3$ and $IE = H_3(1 - H_1 \cdot H_2)$
and the assumptions that the values of $H_2$ and $H_3$, without primes,
equal 1.00 from the beginning of practice.

Apparatus.—A Hull memory drum with
1-in.-square windows was used. The Ss were
presented in the left window, and the Rs in
the right window. The anticipatory method
was employed, S appearing alone for 2 sec.
and then S and R appearing together for 2 sec.,
after which both windows closed again.
Almost immediately, the left window reopened,
showing the next S on the list. The S
was required to anticipate the R within the
first 2-sec. period. The pairs came in a
different order on each trial.

Instructions to S.—The Ss were given the
usual instructions for a paired-associate task
regarding how the materials would be presented
and what they were to try to do. In
addition, they were told that all Ss were solid
black circles, differing only in size, and that
there were nine different pairs in all. Further,
those in $H_3$ groups were told that the Rs
would be the 1-digit numbers from 1 to 9,
while those in $H_3'$ groups were told that the
Rs would be nine 3-digit numbers, each
beginning with a different integer from 1 to 9.
Subjects in $H_2$ conditions were given the
additional information that progressively
higher numbers were assigned to successively
larger circles, while those in $H_2'$ conditions
were told that the magnitudes of the Rs of the
pairs were randomly related to the sizes of the
circles. All Ss were instructed to try to give
some response to each S after the completion
of the first trial. Those in $H_3'$ (3-digit R)
conditions were further directed to respond
with at least the first digit even if they could
not give the other two. As a further measure
to keep omissions at a minimum, all Ss were
told that when they were not sure of a correct
R they should guess.

Length of the learning session.—The experimental
session for each S consisted of 80
continuous trials without any intertrial pause
(48 min.). Only 1 S asked to be relieved
before completing 80 trials; he was replaced
by the next S in the pool.

Subjects.—The 60 Ss were selected on the
basis of availability from a pool of several
hundred college students in an introductory
psychology course and assigned to one of the
six conditions in accordance with a random
order. The writer served as E for all Ss.


RESULTS AND DISCUSSION

Correct anticipations ($H_c$): Group
means.—The manipulation of the difficulty
of each of the three postulated
sets of habit links ($H_1$, $H_2$, and $H_3$)
did produce the predicted difference
in the mean proportion of correct
anticipations per trial ($H_c$) over the
80 trials.³

The difficulty of forming the first
set of habit links (S-r), involving
S discrimination, was manipulated by
using .14-cm. steps between the Ss in
some conditions ($H_1$ – –) and .07-cm.
steps in other conditions ($H_1'$ – –).
Insofar as this set of habits is involved
in learning the paired associates, performance
($H_c$) in Cond. $H_1H_2H_3$
should have been superior to that in
Cond. $H_1'H_2H_3$ since the only difference
between the two tasks was that
the S discrimination was more difficult
in the latter condition. For
like reasons, performance on Cond.
$H_1H_2H_3'$ should have been superior to
that in Cond. $H_1'H_2H_3'$. Both of
these predictions are confirmed by the
results. The mean $H_c$ value for the
10 Ss over all 79 anticipatory trials
is significantly higher (see Table 2)
in Cond. $H_1H_2H_3$ than in $H_1'H_2H_3$
(P < .001) and in Cond. $H_1H_2H_3'$
than in $H_1'H_2H_3'$ (P = .02).

The difficulty of learning the habit
chains of R elements was varied by
using 1-digit Rs in some conditions
(– – $H_3$) and 3-digit Rs in others
(– – $H_3'$). Hence, higher $H_c$ scores
are predicted in Cond. $H_1H_2H_3$ than
in $H_1H_2H_3'$; in Cond. $H_1'H_2H_3$ than
in $H_1'H_2H_3'$; and in $H_1'H_2'H_3$ than
in $H_1'H_2'H_3'$. All the obtained $H_c$
differences (see Table 2) are in the
predicted direction and the three
differences are significant at the .001,
.001, and .01 levels, respectively.

The difficulty of forming the third
set of habit links, the s-$R_a$ or "association"
habits, was manipulated by
assigning Rs to Ss consecutively in
some conditions (– $H_2$ –) and randomly
in others (– $H_2'$ –). Hence,
the $H_c$ scores should be higher in
Cond. $H_1'H_2H_3$ than $H_1'H_2'H_3$ and in
Cond. $H_1'H_2H_3'$ than $H_1'H_2'H_3'$. Both
mean differences are in the predicted
direction (see Table 2) and are
significant at the .001 and .01 levels,
respectively.

TABLE 2

PROPORTION OF CORRECT ANTICIPATIONS PER TRIAL ($H_c$) IN EACH OF THE
SIX CONDITIONS

Cond.             Mean
$H_1H_2H_3$       .793
$H_1'H_2H_3$      .635
$H_1H_2H_3'$      .618
$H_1'H_2H_3'$     .491
$H_1'H_2'H_3$     .462
$H_1'H_2'H_3'$    .340

Note. Each mean is based on 10 Ss per trial.

3 The $H_c$ symbol has been deliberately
selected in the present model for its resemblance
to the Hullian habit strength symbol,
$_SH_R$, because the constructs are similar in that
both are negatively accelerated increasing
functions of the number of reinforced S-R
pairings. However, the $H_c$ differs quantitatively
from the Hullian $_SH_R$ in three respects.
First, the present $H_c$ is a direct one-to-one
function of the probability of correct response,
whereas $_SH_R$ is an intervening construct that
is an exponential function of this probability.
Secondly, $_SH_R$ is the strength of a single S-R
habit, while $H_c$ is the mean strength of a set
of as many habits as there are pairs. Thirdly,
the present $H_c$ need not be zero when N = 0
since, unlike Hull (1943, 1952), we allow
the a and c parameters in the equation
$H_c = c - ae^{-bN}$ to assume different
values (Lewis, 1960).

So far only the directions of the
differences in $H_c$ scores between the
conditions have been considered, but
the relative sizes of these differences
are also of interest. In Table 1 the
equations of $H_c$ in each of the six
conditions are given. Solving for $H_3'$
in each of the equations where it
appears, it is found that in Cond.
$H_1H_2H_3'$, $H_3' = H_c \div H_1$; in Cond.
$H_1'H_2H_3'$, $H_3' = H_c \div H_1'$; and in
Cond. $H_1'H_2'H_3'$, $H_3' = H_c \div (H_1' \cdot H_2')$,
the value of $H_c$ in each case being that
for the given condition. But the
divisors of the right hand members
of each of these three equations can
be seen in Table 1 to be the $H_c$
values of, respectively, Cond. $H_1H_2H_3$,
$H_1'H_2H_3$, and $H_1'H_2'H_3$. Since each
of the three fractions is equal to the
same thing, $H_3'$, then the following
equation can be derived:

$$H_3' = \frac{H_c \text{ of Cond. } H_1H_2H_3'}{H_c \text{ of Cond. } H_1H_2H_3} = \frac{H_c \text{ of Cond. } H_1'H_2H_3'}{H_c \text{ of Cond. } H_1'H_2H_3} = \frac{H_c \text{ of Cond. } H_1'H_2'H_3'}{H_c \text{ of Cond. } H_1'H_2'H_3}$$

Substituting the obtained mean $H_c$
values (shown in Table 2) in this
equation, the values of these three
fractions are found to be .779, .773,
and .736, respectively.
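
The quotient method can be retraced numerically. The sketch below assumes the Table 2 means follow the condition order of Table 1 and that the fourth entry reads .491, the value consistent with the reported quotient of .773; the dictionary keys are ad hoc labels, with a prime marking the hard level.

```python
# Mean H_c per condition as read from Table 2 (order assumed to follow Table 1).
h_c = {
    "H1 H2 H3": 0.793,
    "H1' H2 H3": 0.635,
    "H1 H2 H3'": 0.618,
    "H1' H2 H3'": 0.491,  # assumed reading of the fourth table entry
    "H1' H2' H3": 0.462,
    "H1' H2' H3'": 0.340,
}

# Each pair: (condition with 3-digit Rs, otherwise identical condition with 1-digit Rs).
pairs = [
    ("H1 H2 H3'", "H1 H2 H3"),
    ("H1' H2 H3'", "H1' H2 H3"),
    ("H1' H2' H3'", "H1' H2' H3"),
]

# Each quotient is one derived estimate of H_3'.
quotients = [round(h_c[hard] / h_c[easy], 3) for hard, easy in pairs]
print(quotients)  # [0.779, 0.773, 0.736]
```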

We would expect these three quotients
to be equal to the extent that
the learning of any one of the three
postulated sets of habits proceeded
independently of the difficulty of the
other two sets of habits being acquired
simultaneously. The closeness
of the three obtained quotients suggests
that there is considerable independence.
The closeness of the .779
and .773 values, particularly, suggests
that the difficulty of learning the
response chains is only negligibly
affected by the difficulty of the S
discriminations being learned concurrently.
The third value, .736, is
slightly (about 5%) lower than the
other two. While the first two values
give a derived measure of $H_3'$ in
consecutive S-R assignment conditions,
this third gives an $H_3'$ value in a
condition where the numerical values
of Rs are randomly related to the
size of Ss to which they are assigned.
Hence, the learning of the response
chain may be (very slightly) impeded
by the difficulty of the s-R association
habits being learned concurrently.
This interaction might be expected
in view of the similarity of the s-R
habits and the $R_a$-$s_a$ ... $R_n$ habits
in the present study (both involve
linking numbers) and of the usual
finding that difficulty of learning
is a positively accelerated function of
amount of material (Hovland, 1940).

These derived estimates of $H_3'$ in
Cond. $H_1H_2H_3'$ and Cond. $H_1'H_2H_3'$
can be checked by an independent
method of computing $H_3'$ which
requires us to anticipate the classification
of errors discussed below. Briefly,
the sum of Categories A and C in
Table 3 should yield a running measure
of $H_3'$, the number of R chains
correctly learned. The mean $H_3'$ for
Cond. $H_1'H_2H_3'$ computed in this way
is .778 (as compared to the .773 mean
yielded by the quotient method just
discussed). The mean for Cond.
$H_1H_2H_3'$ is .800 (as compared with
the .779 mean yielded by the quotient
method). The slight discrepancy in
each condition, between the summation
and the quotient method of deriving
$H_3'$, falls far short of conventional
significance levels.

The $H_c$ learning curves.—We can
obtain a more analytic understanding
of the effect of manipulating $H_1$, $H_2$,
and $H_3$ on proportion of correct anticipation
($H_c$) if the group trends
during practice in each of the six
conditions are considered. To calculate
these practice functions, the
group mean $H_c$ values in each condition
over the 80 trials were used to
determine 13 practice points, as follows.
Trial 1 was omitted in the
determination of the fitted curves,
since this trial constituted an initial
presentation to allow S to see each
pair at least once before being required
to anticipate. The first point
is based on scores on Trials 2
and 3; the next point (point "5")
is based on Trials 4, 5, and 6; the
next (point "8") on Trials 7, 8, 9,
and 10; and each of the next 10 points
(called 14, 21, 28, 35, 42, 49, 56, 63,
70, and 77, respectively) on successive
blocks of seven trials. For
example, point 77 is based on the
means of Trials 74 through 80. More
points were calculated in the case of
the earlier trials in order to determine
more accurately the shape of the
function near the beginning of practice,
when improvement was most
rapid. The scores at each practice
point for each of the six groups are
plotted in Fig. 1, which also shows
the fitted function for each condition.
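
The grouping of trials into practice points can be written out explicitly. The sketch below follows the description above; the function name `practice_blocks` is illustrative, and the label of the first point is left as `None` because it is not recoverable from the text.

```python
def practice_blocks():
    """Return (label, trials) pairs for the 13 practice points.

    Trial 1 is omitted; the first three points use 2, 3, and 4 trials,
    and the remaining ten points use successive blocks of seven trials.
    """
    blocks = [(None, [2, 3]), (5, [4, 5, 6]), (8, [7, 8, 9, 10])]
    start = 11
    for label in range(14, 78, 7):  # labels 14, 21, ..., 77
        blocks.append((label, list(range(start, start + 7))))
        start += 7
    return blocks


blocks = practice_blocks()
print(len(blocks), blocks[-1])
```

Each seven-trial block is labeled by its middle trial, so point 14 covers Trials 11 through 17 and point 77 covers Trials 74 through 80.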

All of the functions shown in Fig. 1
are of the inverse exponential family:
$H_c = c - ae^{-bN}$, where $H_c$ is the proportion
of the nine possible anticipations
given correctly on any trial, $c - a$ is the
proportion of correct anticipations at
the outset of practice, c is the asymptote,
and b, the growth parameter.
Exponential functions were used because
they described the data more
accurately than any straight line
in each of the six conditions by an
amount that was significant at the
.05 level in all cases except Cond.
$H_1'H_2H_3$. In Cond. $H_1H_2H_3$ and
$H_1'H_2H_3$ we also fitted hyperbolic
and Gompertz functions to the data,
but these functions were found to
describe the data somewhat less
adequately than the exponential.
Cond. $H_1H_2H_3'$ was the only one of
the six in which the data deviated
from the fitted exponential function
by an amount that approached the
.05 level. The test for goodness of
the fit was based on Lindquist's Case 8
(1947) with df = 10 (i.e., 13 − 3, the
number of trial blocks minus the
number of parameters calculated for
each function).

Fig. 1. Mean proportion of correct anticipations ($H_c$) on each block of
trials. (Figure 1a shows curves for one-digit, and Fig. 1b, for three-digit
response conditions.) [Ordinate: proportion of correct anticipations per
trial ($H_c$). Abscissa: blocks of trials (N).]

Some criticism has been published
by Sidman (1952) and Bakan (1954)
regarding the practice of inferring
the shape of the individual functions
from that of the group functions,
especially in the case of exponential
functions. The criticism has no bearing
on the results reported in the
present experiment since no inference
is made regarding the exact shape of
the individual curves. The objection
is relevant at all only when there is
an appreciable variance among the
growth parameters (b) of the individual
curves, and when the $b \cdot N$
products are large. The second
condition does obtain in the present
study, but whether there is a significant
variance among the individual
b values has not been computed (because
of labor involved in fitting
exponential curves to the data of each
of the 60 individuals). In the absence
of more information on the b parameters,
caution should be exercised in
inferring the shapes of the individual
S curves in the present study.
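
For readers who want to see how the three parameters of the inverse exponential function behave, a minimal sketch follows. The function name `h_c_curve` and the parameter values are hypothetical, chosen only for illustration; they are not the fitted values of any condition in the study.

```python
import math


def h_c_curve(n, c, a, b):
    """Inverse exponential practice function: H_c = c - a * exp(-b * n)."""
    return c - a * math.exp(-b * n)


# Hypothetical parameters (asymptote c, range a, growth rate b); NOT fitted values.
c, a, b = 0.90, 0.60, 0.05

for n in (0, 14, 77):
    print(n, round(h_c_curve(n, c, a, b), 3))
```

At N = 0 the function equals c − a, the initial level of performance; as N grows it rises toward the asymptote c at a rate set by b.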


Effect of the experimental variables on
the different parameters of the $H_c$ learning
curves.—Several conclusions can be drawn
from the functions obtained in the present
study. Each of the three hypothesized
sets of habits is found to play a consistent
part in determining both the initial level
of performance (the $H_c$-intercept, equal
to $c - a$) and the asymptote (c) of the
functions. This conclusion follows from
the obtained effect of increasing the
difficulty of either the $H_1$, $H_2$, or $H_3$
task, which is found in all cases to lower
both of these parameters. There are
seven nonredundant pairs of conditions
between which such a comparison is
possible, and for both the c and the $c - a$
parameters, the above stated effect was
obtained in all seven comparisons.

The growth parameter (b) of $H_c$ appears
to be related to the difficulty of the
sets of component habits, $H_1$, $H_2$, and
$H_3$, in a more complex way. In the first
place, it seems that where the source
of greater difficulty derives from the
greater similarity between the Ss, so
that there are more generalization errors,
then the rate of learning of the paired-associate
task is slower as the difficulty
increases (compare b parameters in Cond.
$H_1H_2H_3$ and $H_1'H_2H_3$ and in Cond.
$H_1H_2H_3'$ and $H_1'H_2H_3'$). As regards the
effect on the growth parameter of increasing
the difficulty of the R-chains ($H_3$)
or of the s-R associations ($H_2$), the b
parameters shown in Fig. 1 suggest that
increasing the difficulty of one or both
of these increases the rate of learning
(compare Cond. $H_1H_2H_3$ with $H_1H_2H_3'$;
and Cond. $H_1'H_2H_3$ with $H_1'H_2H_3'$, with
$H_1'H_2'H_3$, and with $H_1'H_2'H_3'$), but that
increasing the difficulty of just one ($H_2$
or $H_3$) increases the learning rate more
than does increasing the difficulty of
both $H_2$ and $H_3$ (compare Cond. $H_1'H_2H_3'$
with $H_1'H_2'H_3'$; and Cond. $H_1'H_2'H_3$
with $H_1'H_2'H_3'$). These interpretations
are post factum, however, and call for
further investigation.



Practice curves of stimulus discrimina- 
tion and of intrusion errors.—An intru- 
sion error is defined as the giving of an 
R that is in the list but giving it to an 
S other than the one with which it is 
actually paired in the list as presented. 
Hence, the probability of an intrusion (IE) is directly proportional to the learning of the R chains (H3) and inversely proportional to the learning of the S discrimination (S-r) habits (H1) and of the (s-R) association habits (H2); that is:

IE = H3(1 - H1·H2)
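
A minimal sketch of this relation, using the form the text gives later, IE = H3(1 - H1·H2), alongside the S-generalization measure 1 - H1. The function names and the habit strengths passed in are hypothetical and serve only to show when the two measures agree.

```python
def intrusion_probability(h1, h2, h3):
    """IE = H3 * (1 - H1 * H2), as stated in the text."""
    return h3 * (1.0 - h1 * h2)

def s_generalization(h1):
    """S generalization as defined in the text: 1 - H1."""
    return 1.0 - h1

# Hypothetical habit strengths.
print(intrusion_probability(0.6, 1.0, 1.0), s_generalization(0.6))
print(intrusion_probability(0.6, 0.5, 1.0), s_generalization(0.6))
print(intrusion_probability(0.6, 0.5, 0.5), s_generalization(0.6))
```

With H2 = H3 = 1.00 the two measures coincide; once H2 or H3 falls below 1.00 they diverge, and in these examples they move in opposite directions depending on which habit is incomplete.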


An S-generalization error, on the other hand, was defined as giving the wrong r to an S and measured by 1 - H1, a value that will tend to differ from IE when either H2, H3, or both are less than 1.00. The practice of defining stimulus generalization during paired-associate learning has, as was pointed out above, led to considerable controversy. An E may be allowed a certain liberty in defining his terms, as long as he does so clearly. However, this definition that S discrimination has occurred only when S discriminates the Ss with the Rs that E has arbitrarily assigned becomes misleading, particularly as regards the practice curve of the generalization function and particularly when the Rs themselves or their relations to the Ss must be learned (that is, when H3 or H2 increases with practice). Eleanor Gibson (1940, 1942), whose provocative research using the IE definition of S generalization has given rise to this controversy, has herself pointed out (Gibson, 1959; Gibson & Gibson, 1955) the inadequacy of this definition.

The materials used in this study were 
designed to allow the responses given by 
Ss in each instance to be classified into 
separate categories in terms of which 
the practice curves of H1, H2, and H3,
as well as of Hc, could be calculated.
In this way, we could test the above 
formulations. Each of S’s 711 responses 
(9 pairs X 79 trials) was put in one of 
the following five categories: 


A. A first digit appropriate to the given S (indicating a correct discrimination) and the second and third digits appropriate to the first (indicating a correct R). The proportion of responses in this category equals, by definition, Hc, the usual measure of a correct response in paired-associate learning.

B. A first digit appropriate to the given S (indicating a correct discrimination) and second or third digits not appropriate to the first (indicating a wrong R).

C. A first digit not appropriate to the given S (indicating a wrong discrimination or a wrong s-R association or both) and second and third digits appropriate to the first (indicating a correct R). The proportion of responses that falls in this category equals, by definition, IE.

D. A first digit not appropriate to the given S (indicating a wrong S discrimination or a wrong s-R association or both) and second or third digit not appropriate to the first (indicating a wrong R).

E. An omission, no response given.
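
The five-way scoring rule can be sketched as a small classifier. This is an illustration only: it assumes three-digit responses written as strings and assumes that every response in the list begins with a different digit, so that "digits appropriate to the first" reduces to asking whether the whole response occurs in the list. The function names and the example list are hypothetical.

```python
from collections import Counter

def classify(response, correct, list_responses):
    """Return the category letter (A-E) for one response to one stimulus."""
    if not response:
        return "E"                          # omission
    first_ok = response[0] == correct[0]    # first digit fits the given S
    chain_ok = response in list_responses   # later digits fit the first digit
    if first_ok:
        return "A" if chain_ok else "B"
    return "C" if chain_ok else "D"

def category_proportions(categories):
    """Proportion of responses falling in each of the five categories."""
    counts = Counter(categories)
    total = len(categories)
    return {letter: counts.get(letter, 0) / total for letter in "ABCDE"}

# Hypothetical three-pair list; the study used nine pairs.
responses_in_list = ["147", "258", "369"]
scored = [classify(r, "147", responses_in_list)
          for r in ("147", "149", "258", "259", "")]
print(scored)
print(category_proportions(scored))
```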


The proportions of responses falling into each of these five categories for Cond. H1H2H3' and H1'H2H3' during successive practice intervals are shown in Fig. 2 and Table 3. Only for those two conditions are all categories used unambiguously. In Cond. H1H2H3, H1'H2H3, and H1'H2'H3, where initial H3 = 1.00, i.e., the Rs are one-digit numbers, no Category B or D responses would occur. In Cond. H1'H2'H3 and H1'H2'H3' (where Rs are randomly assigned to Ss), a wrong first digit indicates that either the S discrimination, or the s-R association, or both, are wrong. Hence, in these two conditions the interpretation of Category C and D responses is ambiguous, and separate measures of H1' and H2' are not possible.

Representing the proportion of re- 
sponses that fall into these categories 
by the letters A, B, C, D, and E, respec- 
tively, then in all six conditions, the pro- 








FIG. 2. Practice trends of the mean proportion of responses in each of the five categories (as discussed in text) for Cond. H1H2H3' (Fig. 2a) and Cond. H1'H2H3' (Fig. 2b). Ordinate: proportion of responses in the category; abscissa: blocks of trials.
TABLE 3


OBTAINED R 
CAaTec 


PROPORTION OF THI 


HiHeHs H 
Correct Response 
Correct S-r and s-R but 
inadequate R chain 
Incorrect S-r or s-R, but 
adequate R chain (=IE) 
Incorrect S-r, and 
inadequate R chain 
Omission .000 


Total 


Note.—Each me 
* Category B and 


1.000 


80 trials 
cur 


is based on over 
) response 


a 


n 
I 


portion of correctly learned R chains, H3, equals A + C, and the proportion of pairs that have been mastered to the extent of both correct S-r discriminations and s-R associations, H1·H2, equals A + B. Since, as discussed above, the proportion of intrusion errors, IE = H3(1 - H1·H2), we find by substitution that IE = (A + C)(1 - A - B).⁴


4 These equations apply in conditions where there are no omissions, or where it is assumed that such omissions as do occur represent responses which, had they occurred, would all have been Category D responses, that is, completely wrong. In the present study a different assumption has been made, namely, that omissions represent error responses that, had they been made, would have divided among the other categories (B, C, and D) in the same proportion as the erroneous responses that were made. Hence, the B and C values used in the subsequent discussion are "corrected" values (represented by the symbols B' and C'), equal to the obtained B and C values, plus a proportional amount of the E values, yielding the "corrected" equations of H1·H2 = A + B' and IE = (A + C')(1 - A - B'). There are still other assumptions regarding omissions that might have been made. This discussion of omissions is important mainly in connection with Cond. H1'H2'H3', since the number of omissions was negligible in Cond. H1H2H3 and H1'H2H3 and occurred in appreciable numbers in the other three conditions, only during the first few trial intervals.
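
The omission correction described in this footnote can be sketched as follows. The sketch reads "a proportional amount of the E values" as apportioning E among B, C, and D in the ratio of the errors actually made; the function names and the example proportions are hypothetical, not data from the study.

```python
def correct_for_omissions(b, c, d, e):
    """Return (B', C'): B and C plus their proportional share of omissions E."""
    errors = b + c + d
    if errors == 0:
        return b, c              # no overt errors to apportion against
    return b + e * b / errors, c + e * c / errors

def predicted_ie(a, b_corr, c_corr):
    """Predicted intrusion errors: IE = (A + C') * (1 - A - B')."""
    return (a + c_corr) * (1.0 - a - b_corr)

# Hypothetical category proportions that sum to 1.00.
a, b, c, d, e = 0.50, 0.10, 0.20, 0.10, 0.10
b_corr, c_corr = correct_for_omissions(b, c, d, e)
print(b_corr, c_corr)                  # corrected B' and C'
print(a + b_corr)                      # estimate of H1 * H2
print(predicted_ie(a, b_corr, c_corr))
```

When E is zero the corrected values equal the obtained ones, so the same two functions cover the conditions without omissions.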


in the t 


SORIES IN |} 


1.000 


t 


SPONSES 


} 
i 


PHat FELL IN hay 


AcH CONDITION 


ti 


Conditions 


HiH:Hy’ Hi’H:Hy’ 


618 
094 


491 
.079 


182 


000 .023 025 .002 


1.000 1.000 1.000 1.000 


iree 1-digit R conditions. 


With these preliminaries understood it is possible to test the hypotheses concerning stimulus discrimination and intrusion errors as a function of practice. In Cond. H1H2H3 and H1'H2H3 (the two conditions with consecutively-assigned one-digit Rs) the test is a direct one since initial H2 and H3 = 1.00 and there are no omissions. Hence, Hc = H1·H2·H3 becomes Hc = H1, i.e., in these two conditions the overall performance score on the paired associates (Hc) is identical with that of stimulus discrimination (H1). In both of these conditions there is a clear tendency for H1 to increase monotonically throughout practice. The trends in both conditions are well fitted by rising, negatively accelerated, exponential functions (see Fig. 3). These results tend to confirm the hypothesis that discrimination increases monotonically during practice with a set of paired associates and to infirm the opposing hypothesis that this function first rises, then falls.

Since initial H2 and H3 = 1.00 and no omissions occur in these two conditions (H1H2H3 and H1'H2H3), the equation, IE = H3(1.00 - H1·H2), simplifies to IE = 1.00 - H1, or its equivalent, IE = 1.00 - Hc. This congruency between the stimulus generalization curve and the intrusion error curve is a special case that occurs, as described by the
TABLE 4

PROPORTION OF CORRECT S DISCRIMINATIONS PER TRIAL (H1) IN THE FIRST FOUR CONDITIONS

Cond.         SD
H1H2H3        .048
H1'H2H3       .057
H1H2H3'       .101
H1'H2H3'      .123

Note.—Each mean is based on 10 Ss over 80 trials, 9 pairs per trial.


model, when initial H2 and H3 = 1.00 and there are no omissions. When the IE curve does coincide with that of stimulus generalization, it, as well as the latter, falls throughout practice.

In the two conditions with consecutively assigned 3-digit Rs, H1H2H3' and H1'H2H3', the equation H1·H2 = A + B' reduces to H1 = A + B' on the assumption that H2 = 1.00 in the consecutive assignment conditions. (This assumption may be somewhat extreme, in which case the values of H1 shown in Table 4 and Fig. 3 would slightly underestimate the true values.) Figure 3 shows the practice curves for S discrimination in




TABLE 5

PROPORTION OF INTRUSION ERRORS (IE) PER TRIAL IN EACH CONDITION

Cond.         Predicted Mean    Obtained Mean
H1H2H3        .207*             .207
H1'H2H3       .365*             .365
H1H2H3'       .199              .182
H1'H2H3'      .305              .287
H1'H2'H3      .436*             .436
H1'H2'H3'     .348              .330

Note.—Each obtained score is based on 10 Ss over 80 trials, 9 pairs per trial. Predicted scores were derived from the equation: IE = (A + C')(1 - A - B').

* In the three 1-digit R conditions, the "predicted" means necessarily coincide with the obtained means.


these two conditions, which can be seen 
to follow the predicted monotonically 
rising trend, well fitted by negatively- 
accelerated exponential functions, rather 
than the sometimes reported rising, then 
falling curve. 

The intrusion error values (Table 5) are given by the formula IE = (A + C')(1 - A - B'). These predicted practice functions for IE in Cond. H1H2H3' and H1'H2H3' are shown in Fig. 4, where the obtained values (that is, number of Category C responses) are also shown.


FIG. 3. Mean proportion of correct S discriminations (H1) on each block of trials for the four conditions in which this information was available. Ordinate: proportion of Ss correctly discriminated; abscissa: blocks of trials (N).
FIG. 4. Practice trends for mean proportion of intrusion errors (IE) per trial block in the four conditions in which Category B or E responses occurred. (Lines indicate predicted trends; symbols, the obtained values in the given condition.)


It can be seen that in each condition there is good agreement between predicted and obtained scores, with respect to practice trend and maximum point, as well as to height. There is a slight tendency in both conditions for the predicted values of IE to overestimate the obtained values, but these discrepancies yield F values of only 1.14 and 0.61 in the two conditions (using the Ss × Trials interaction as the error term), and therefore such discrepancies as there are can be attributed to chance. These IE curves in both conditions show an initial rise, followed by a leveling off and a slow decline, while the stimulus generalization trend has just been seen to follow a falling course throughout practice. This disagreement demonstrates that, while the shape of the IE function does coincide with that previously reported (Gagné, 1950; Gibson, 1942; Underwood & Goad, 1951) for S generalization, it does not coincide with S generalization more logically defined.

For the final two conditions, H1'H2'H3 and H1'H2'H3', with randomly paired Ss and Rs, the response categorization possible with the present materials yields separate values of H3 and of H1·H2, but not of H1 and H2 separately, as discussed above. Hence, it is not possible to test the hypotheses regarding S discrimination with the results from these conditions. The hypotheses regarding IE, however, can be tested since for this a composite value for H1·H2 suffices.

Since in Cond. H1'H2'H3, initial H3 = 1.00, the predicted IE = 1.00 - A - E. The values for IE so calculated necessarily coincide with the obtained values since there are no Category B or D responses in this condition. The practice trend shows a brief initial rise followed by a subsequent decline (see Fig. 4). Here again it can be seen that intrusion errors do not furnish a valid measure for stimulus generalization: first, because they are influenced also by omissions (which account for the brief initial rise); and secondly, because aside from the omissions, which become infrequent after the first few trial intervals, the IE = 1.00 - H1·H2, while stimulus generalization = 1.00 - H1. Hence, since in this condition initial H2 is less than 1.00, IE overestimates the amount of stimulus generalization.

In Cond. H1'H2'H3', the "corrected"
equation is again used to determine the 
predicted IE scores. These calculated 
scores are shown, together with the 
obtained scores, in Fig. 4. Here again 
is seen the initial tendency to rise, 
followed by a leveling off and perhaps 
slight decline as practice continues. 
There is close agreement between the 
magnitudes and the trends of predicted 
and obtained scores (F = 0.70). In this 
condition the invalidity of using the 
IE score as a measure of stimulus 
generalization is more obvious still, since 
it is determined not only by H1 but by
H2, H3, and the number of omissions as
well. 

Hence, the present model has proved 
adequate to predict the IE curves over a 
wide range of conditions. It has also 
shown that stimulus generalization falls 
monotonically during paired-associate 
learning and that intrusion errors con- 
stitute a valid measure of S generaliza- 
tion only when initial H; = 1.00; other- 
wise, the IE curve tends to rise at first 
and then fall with continued practice. 


SUMMARY 


A stimulus-response analysis of paired- 
associate learning is described, postulating 
that the learning of each of the pairs involves 
the formation of three different connections: 
(a) between the stimulus member of the pair 
and a mediating, labeling response which 
discriminates it from the stimulus members 
of the other pairs; (b) between the stimulus 
produced by that labeling response and the 
response member of the pair; and (c) between 
the stimuli produced by each successive 
subelement of the response member and the 
following subelement. A quantitative model 
which relates the strength of these component 
habit connections to various aspects of per- 
formance on paired associates is described. 

Deductions from the model were tested by 
means of an experiment involving 10 Ss in 
each of six conditions. The antecedent vari- 
ables were the difficulties of forming each of 
the three component habits described above, 
manipulated by using different stimulus
members and response members in the pairs, 
and by assigning the responses to the stimuli 
in varying ways. With these materials it was 
possible to analyze the responses given by the 
Ss so as to obtain separate measures of the 
strength of the three component habits during 
learning. A typical paired-associate learning 
situation was employed, using a Hull memory 
apparatus and the anticipatory method of 
performance. 

It was found that the relations between 
the gross learning scores in the different 
conditions predicted from the model agreed 
closely with the obtained results in direction 
and magnitude. The obtained practice curves 
for intrusion errors agreed closely over a wide 
variety of conditions with those predicted 
from the model which postulates that the 
number of intrusions is a function of the 
strengths of all three of the habits described 
above, and not just of that of the discrimina- 
tion habits as assumed in some studies. The 
intrusion error curves tended to first rise, 
then fall slightly with practice under most 
conditions. The practice curve for stimulus 
generalization during paired-associate learn- 
ing was found to have a monotonically falling 
slope rather than the first rising, then falling 
shape sometimes inferred. 
Probably everyone has some hunches
on how one taste quality affects 
another, on how sugar affects the 
taste of salt, on how acid affects the 
perception of bitterness, and the like, 
but many of these assertions are 
contradictory. Probably some of 
this disagreement is due to people’s 
confusing changes in intensity of the 
individual taste qualities with the 
more ambiguous tastes created by 
the addition of one unitary taste 
stimulus to another. Thus, it may 
be that a salt-and-sugar solution produces
relatively unclear taste sensations
even though the individual
qualities, when observed with a more 
analytic set, have the same subjective 
intensities as they would have in a 
pure salt or pure sugar solution. 


No systematic investigation of taste inter- 
actions at suprathreshold stimulus intensities 
has ever been reported. Anderson (1950) 
has presented a critical review of the literature 
and the results of his own systematic study 
on interactions among stimuli at near-thresh- 
old concentrations. Fabian and Blum (1943) 
have summarized the early literature on taste 
interactions and reported their investigation 
of interactions between various sweet, salty, 
and sour substances; however, their emphasis 
was on the effect of a subthreshold concen- 
tration of one substance upon the perceived 
intensity of a suprathreshold concentration 
of another. They concluded that a subthreshold concentration of salt (NaCl) increased the intensity of five different sugars and reduced the sourness of five organic acids and one inorganic, hydrochloric acid (HCl). Each subthreshold concentration of the five sugars, in turn, reduced saltiness of NaCl and the sourness of the six acids. All organic acids enhanced saltiness, although HCl appeared to have no effect; but the effect of acids upon sweetness seemed to be in part a function of the specific acids and the specific sugars used. For example, the sweetness of fructose was reduced by lactic, tartaric, acetic, and malic acids; but HCl and citric acid had no effect. In contrast, citric, lactic, and tartaric acids enhanced the sweetness of sucrose, but HCl and acetic acid seemed to have no effect.

More recently, Beebe-Center, Rogers, Atkinson, and O'Connell (1959) have reported on the interactions between suprathreshold concentrations of NaCl and sucrose. Their major conclusion was that some enhancement of sweetness by salt was evident in the case of weak solutions, but the principal effect was one of masking.

1 This paper reports research undertaken at the Quartermaster Food and Container Institute for the Armed Forces, and has been assigned Number 1093 in the series of papers approved for publication. The views or conclusions contained in this report are those of the authors. They are not to be construed as necessarily reflecting the views or indorsement of the Department of Defense.


For the purposes of the present 
study, we assumed the existence of 
four basic taste qualities—salt, sweet, 
sour, and bitter—and that the appro- 
priate stimulus for each is NaCl, 


sucrose, citric acid, and caffeine, 
respectively. The interactions in- 
vestigated were those between every 
pair of qualities. In each such pair, 
a given stimulus was studied both 
as to its effect on another and how 
it was affected by the other. Thus, 
mixtures of sucrose and NaCl were 
examined from two points of view: 
the effect of sucrose (secondary stim- 
ulus) upon the perceived intensity 
of the saltiness of NaCl (primary 
stimulus); and conversely, the effect 
of NaCl (now the secondary stimu- 
lus) upon the perceived intensity of 
sweetness of sucrose (now the pri- 
TABLE 1

PERCENTAGE CONCENTRATIONS OF SOLUTIONS USED IN STUDY OF INTERACTIONS OF TASTE QUALITIES


Taste Quality 


Salt (NaCl 
Bitter (caffeine) 
Sweet (sucrose) 
Sour (citric acid) 


15 
031 
50 
.009 


45 
.076 

1.70 
.029 


have been corrected for the one molecule of water of crystallization. 


on the effect of sucrose on saltiness. 
centrations were .00, .45, 1.70, and 6.00. 
centrations were slightly revised for the subsequent ones. 


mary stimulus). In all, 12 sets of interactions are possible.


METHOD 


Taste solutions.—Two series of each stim- 
ulus were prepared, a primary (‘‘effect on’’) 
and a secondary (“‘effect of’). The concen- 
trations in the primary series were intended 
to cover the range of intensities from barely 
perceptible to almost extreme. The concen- 
trations in the secondary series were intended 
to result in perceived intensities from none to 
moderate, since it was thought that interaction 
effects would be most easily demonstrated 
if the four concentrations of the secondary 
series were of generally lower intensity than 
the primary series. Selections of the specific 
concentrations within these ranges were based 
upon previous unpublished data. 

All concentrations are shown in Table 1 
and represent the number of grams of the 
solute per 100 ml. of solution. The weights 
for citric acid have been corrected for the one 
molecule of water of crystallization per citric 
acid molecule. It will be noted that, apart from the 0% concentrations of the secondary stimulus, the concentrations in both series are approximately logarithmically spaced. The sucrose and citric acid were Merck Reagent, the NaCl was Merck C. P., and the caffeine was Pfizer U. S. P. Charcoal-filtered distilled water was always used as the solvent.

In each interaction experiment, the 16 solutions consisted of each concentration in a given primary series with each concentration in one of the three remaining secondary ones (Table 1). For example, in the experiment on the effect of citric acid upon the perceived intensity of bitterness, the primary stimulus series of caffeine (.031%, .076%, .195%, .500%) is used in combination with the


Primary Series 
(Rated for Intensity) 


1.40 
195 

5.80 
.089 


The NaCl concentrations were .10, .35, 


Secondary Series 
(Added to Solutions) 


.00 
.00 
.00 
.00 


13 44 
048 


1.90 


1.50 
093 
8.00 


An exception occurred in the experiment 


20, and 4.00; and the sucrose con- 


This interaction experiment was the first one conducted, and the con- 


secondary series of citric acid (.00%, .007%, .023%, .073%), as shown in Table 2.

Experimental design.—Each interaction experiment was independently replicated, the interval between replications varying from 1 wk. to 16 mo. The basic experimental design is given in Table 2, where the experiment on the effects of citric acid upon bitterness is shown as an example.

The levels of the primary stimulus, caffeine, and the secondary stimulus, citric acid, were taken from Table 1. The 16 solutions were divided into two sets. Half the judges (Os) in each replication evaluated the solutions marked "O" in Table 2, while the other half evaluated the solutions marked "X."²

Judges.—The Os were selected from a pool of approximately 700 civilian and military, male and female, employees who routinely participate in preference tests of foods though


2 The design is a half-replicate in which the interaction of the "linear" component of the primary stimulus and the "cubic" component of the secondary is confounded with Judge-Group. Quotation marks are used to indicate that these components are not linear or cubic in the quantitative sense, but rather involve comparisons between pairs of levels. Thus, the "linear" (Component I) of caffeine, when used as the primary stimulus, means that the average perceived bitterness of the eight solutions in the first two caffeine levels is compared with the average of all eight solutions in the second two levels. Similarly, the "cubic" (Component III) of caffeine compares the average perceived intensity of the eight solutions in the first and third levels with the average of the eight solutions in the second and fourth. The quadratic (Component II) is a true quadratic and involves comparing the middle two levels against the lowest and highest ones.
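
The three components described in this footnote can be written as contrast vectors over the four levels, and the half-replicate split can be derived from them. The sketch below is illustrative: the sign conventions, the names, and the decision about which half is given to which judge group are assumptions, since only the confounded interaction (primary Component I by secondary Component III) is stated.

```python
from itertools import product

# Contrast weights over levels 1-4 (the overall sign of each is arbitrary).
COMPONENT_I = (-1, -1, +1, +1)     # first two levels vs. second two
COMPONENT_II = (-1, +1, +1, -1)    # middle two levels vs. lowest and highest
COMPONENT_III = (+1, -1, +1, -1)   # first and third vs. second and fourth

def dot(u, v):
    """Inner product; zero means the two contrasts are orthogonal."""
    return sum(x * y for x, y in zip(u, v))

def split_half_replicate():
    """Split the 16 (secondary, primary) cells by the sign of I(primary) x III(secondary)."""
    set_pos, set_neg = [], []
    for sec, prim in product(range(4), range(4)):
        sign = COMPONENT_I[prim] * COMPONENT_III[sec]
        (set_pos if sign > 0 else set_neg).append((sec, prim))
    return set_pos, set_neg

set_pos, set_neg = split_half_replicate()
print(dot(COMPONENT_I, COMPONENT_II), dot(COMPONENT_I, COMPONENT_III),
      dot(COMPONENT_II, COMPONENT_III))
print(set_pos)
print(set_neg)
```

Under these assumptions each half contains eight solutions, with two from every primary level and two from every secondary level, so each judge group still samples every concentration of both stimuli.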
TABLE 2

EXPERIMENTAL DESIGN EXAMPLE: EFFECTS OF CITRIC ACID UPON THE PERCEIVED INTENSITY OF BITTERNESS

Levels of Caffeine 
Levels of Citric Acid (Primary Stimulus) 
(Secondary 


Stimulus) 


A on 2s - 
031% | .076% | .195% | .500% 


1-.00% C xX oO 
I1-.007% X 
111-.023% : O | 
1V-.073% X } 


O 





Note.—Solutions marked "X" were evaluated by half the Os, and the solutions marked "O" were evaluated by the other half. Thus, the interaction of the "linear" component of caffeine and the "cubic" component of citric acid is confounded with Judge-Group.


rarely in psychophysical investigations. In- 
dependent selections of 40 Os were made from 
the pool for each replication of each experi- 
ment. Departures from randomness occurred 
when some were absent or were otherwise 
not available on the days the tests were 
conducted. Because 960 persons were re- 
quired (12 interactions X 40 Os X 2 replica- 
tions) and because replacement into the pool 
followed selection, some Os participated in 
more than one experiment or replication. 
Psychophysical method.—The single stim- 
ulus method was used with a nine-interval 
rating of intensity. Alternate intervals were 
anchored with the following descriptions of 
intensity: none, slight, moderate, strong, and 
extreme. The intervals were assigned succes- 
sive integers from 1 (none) to 9 (extreme) and 
the ratings then treated quantitatively. The 
Os were instructed to rate the intensity of the 
quality represented by the primary stimulus, 
ignoring other qualities that might be present. 
They were told not to swallow the samples. 
Fresh solutions were prepared for each 
replication session. A session was usually 
completed in 1 day although occasionally 
2 consecutive days were needed to test the 
40 Os. Each O sat in a semi-enclosed testing 
booth. They were presented one at a time 
with 6-ml. samples in coded 1-oz. glasses 
through a turntable in a wall separating the 
booth from the serving area. After rating 
each sample, O rinsed his mouth ad lib. with 
charcoal-filtered distilled water. The time 
between the rating of one solution and the 
presentation of the next was 30 sec. During 
the course of the experiments, the question 
arose as to whether these untrained and 
unscreened Os were aware of the characteris- 


tics of a sour or bitter substance and the 
distinctions between them. It was decided 
that on the second replication of each inter- 
action experiment, O would receive a reference 
sample of the primary stimulus prior to 
rating the other eight and would be asked 
to note carefully its flavor without rating it. 
The reference sample was always a pure 
solution of the second highest concentration 
of the primary stimulus. In the analyses of 
variance, session was a source of variation 
although it is a generic term that includes 
ordinary session variability per se, whether 
or not a reference sample was served, actual 
differences among judge groups, etc. 


RESULTS 


A separate analysis of variance was 
performed for each taste interaction. 
The total variation was partitioned 
among the following sources of varia- 
tion: each orthogonal component of 
the primary and of the secondary 
stimulus, the interaction of every 
component of the primary with every 
component of the secondary stimulus, 
session, interaction of session with 
each orthogonal component and with 
each primary-secondary interaction,
judge, and the interaction of judge and solution. Each source except the last two had 1 df. There were 76 df for Judge and 532 for Judge × Solution. The .01 level was chosen as the criterion for significance.³
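
The reported degrees of freedom can be checked with a short calculation. This is one accounting that reproduces the stated 76 and 532; it assumes 40 Os per session split into two judge groups of 20, each O rating 8 of the 16 solutions, and the variable names are illustrative rather than the authors' own.

```python
judges_per_session = 40
sessions = 2                      # the two replications
judge_groups_per_session = 2      # the "X" half and the "O" half
solutions_per_judge = 8

n_judges = judges_per_session * sessions
n_groups = sessions * judge_groups_per_session

df_judge = n_judges - n_groups                               # judges within groups
df_judge_x_solution = df_judge * (solutions_per_judge - 1)   # within groups

df_solution = 3 + 3 + 9           # primary, secondary, primary x secondary
df_session = 1
df_session_x_solution = df_solution * df_session

df_total = n_judges * solutions_per_judge - 1
df_sum = (df_solution + df_session + df_session_x_solution
          + df_judge + df_judge_x_solution)

print(df_judge, df_judge_x_solution, df_sum, df_total)
```

Under these assumptions the 31 single-df sources plus the two error terms account for all 639 degrees of freedom among the 640 ratings in one interaction experiment.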

Except for those sources of variation 
which were confounded with Judge- 
Group, the error term was Judge- 
Solution interaction (within groups). 
Those sources, for which variation 

3 Four tables showing the mean ratings of 
each solution and one table showing the 
sources of variation and their corresponding 
df’s, mean squares, and levels of significance 
have been deposited with the American Docu- 
mentation Institute. Order Document No. 
6777 from ADI Auxiliary Publications Proj- 
ect, Photoduplication Service, Library of 
Congress; Washington 25, D. C., remitting 
in advance $1.25 for microfilm or $1.25 for 
photocopies. Make checks payable to: 
Chief, Photoduplication Service, Library of
Congress. 
among Judges (within groups) was the error term, were Session, the interaction of Component I of the secondary stimulus and Component III of the primary, and the three-factor interaction of Session × Secondary-I × Primary-III.

As would be expected, Component I 
of the primary stimulus was in each 
case significant, far beyond the .001 
level. In 21 of 24 cases, Components 
II and III were also highly significant. 
Inspection of the ratings³ shows
that the major effects of increasing
the primary stimulus concentrations
were true linear, although significant
departures did occur.⁴

Each mean was based upon 40 
ratings, 20 in each of the two replica- 
tions. The mean ratings of each of 
the 16 solutions in each interaction 
experiment are plotted in Fig. 1. The 
results and conclusions will be dis- 
cussed separately for each interaction. 
The general error term, Judge-Solu- 
tion interaction, will be indicated in 
parentheses. 


Effects of Caffeine 


Upon saltiness.—No significant effects 
of caffeine upon saltiness were found. 
Regardless of the level of caffeine, salti- 
ness was merely a function of the salt 
itself. There was only a slight suggestion 
that if the caffeine level were increased 
even further, saltiness might eventually 
be enhanced. (Error MS = 1.39.) 

Upon sweetness.—No variables affected 
sweetness other than the sucrose con- 
centrations themselves. However, Caffe- 
ine-I was almost significant; the ob- 
served F was 6.50, compared to an F 


4 If orthogonal polynomials are used on the mean ratings (Anderson & Bancroft, 1952), it will be found that the true linear is greater than Component I, with a corresponding decrease in the mean square for the true cubic. This effect is due to the fact that although Component I is mostly linear, it does reflect some cubic; Component III, in turn, contains some linear.
of 6.64 required for significance at the .01
level. Only a suggestion of masking 
by caffeine was present, the difference 
between the lower two and higher two 
caffeine concentrations being only .26 
scale points. Higher levels of caffeine 
might demonstrate eventual masking 
of sweetness. (Error MS = 1.70.) 
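
The relation between the components and the orthogonal polynomials described in the footnote can be checked with a short sketch. It assumes that, for four equally spaced levels, Component I contrasts the lower two levels with the higher two and Component II contrasts the outer levels with the middle ones (both suggested by the text), and that Component III alternates in sign across levels. That last coding, the function names, and the mean ratings used are illustrative assumptions, not values from the study.

```python
# Contrast codings for four ordered concentration levels.
# COMPONENT_III and the example means are assumptions made for illustration.
COMPONENT_I = (-1, -1, 1, 1)     # lower two vs. higher two levels
COMPONENT_II = (1, -1, -1, 1)    # outer levels vs. middle levels
COMPONENT_III = (-1, 1, -1, 1)   # alternating levels (assumed coding)

# Orthogonal polynomial coefficients for four equally spaced levels.
LINEAR = (-3, -1, 1, 3)
CUBIC = (-1, 3, -3, 1)


def dot(u, v):
    """Inner product of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


def shared_fraction(u, v):
    """Squared cosine between two contrasts: the share of one carried by the other."""
    return dot(u, v) ** 2 / (dot(u, u) * dot(v, v))


def contrast_ss(coeffs, means, n_per_mean):
    """Single-df sum of squares for a contrast on equally replicated means."""
    return n_per_mean * dot(coeffs, means) ** 2 / dot(coeffs, coeffs)


# The linear and cubic trends are mixtures of Components I and III.
linear_from_components = tuple(2 * a + b for a, b in zip(COMPONENT_I, COMPONENT_III))
cubic_from_components = tuple(2 * b - a for a, b in zip(COMPONENT_I, COMPONENT_III))

# Hypothetical, perfectly linear mean ratings, each based on 40 ratings.
means = (2.0, 3.0, 4.0, 5.0)
ss_linear = contrast_ss(LINEAR, means, 40)
ss_component_i = contrast_ss(COMPONENT_I, means, 40)

print(linear_from_components == LINEAR, cubic_from_components == CUBIC)
print(shared_fraction(COMPONENT_I, LINEAR), shared_fraction(COMPONENT_I, CUBIC))
print(ss_linear, ss_component_i)
```

Under these assumed codings Component I carries four-fifths linear and one-fifth cubic trend, so a purely linear set of means yields a larger sum of squares for the true linear contrast than for Component I, which is the pattern the footnote describes.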
Upon sourness.—Components I and III of caffeine were each significant at the .001 level. Inspection of the mean intensity ratings reveals that the effect was one of enhancement, with no other significant sources of variation complicating the interpretation. (Error MS = 2.94.) 


Effects of NaCl 


Upon bitterness.—No significant effects were found, other than those attributable to caffeine. The curves were somewhat jagged and suggested a Caffeine-I X Salt-III interaction. However, the interaction of these two components was confounded with Judge-Group, and hence the variation among Os was used as the appropriate error term. (Both Judge-Solution and between-Judge MSs were unusually high, 3.68 and 9.63, respectively.) Despite the fact that the same types of curves emerged in the two replications and that there was a high mean square for the interaction, use of the large error term (between-Judge rather than Judge X Solution) worked against obtaining a significant F. In view of the magnitude of the mean square for this source of variation, a logical next step would be to replicate this entire experiment with a different set of confounding relationships, e.g., Primary-III X Secondary-II. (Error MS = 3.68.) 

Upon sweetness.—Salt Components I and II were significant at the .01 level. This is interpreted to mean that salt generally tends to mask sweetness, and there is some curvature in the effects. The interpretation is complicated by two interactions which were also significant at the .01 level: (a) The first is the Salt-I X Sucrose-I interaction. For the lower sucrose concentrations, the various levels of salt had relatively little effect. Instead, the reduction of sweetness occurred primarily for solutions that were high in both salt and sucrose. (b) The second is the Salt-II X Sucrose-I interaction. Again, for the lower sucrose concentrations, the various levels of salt had relatively little differential effect on sweetness. The greatest effect was on the higher sucrose concentrations by the highest salt concentrations, with a lesser effect by the lower salt concentrations. In both of these interactions, the very highest salt concentration had its greatest sweetness-depressing effects at the higher sucrose concentrations. Thus, while the over-all effect of salt upon sweetness was one of masking, this did not occur or was not as pronounced for the lowest sucrose concentrations. In fact, for the very lowest sucrose, salt seemed to enhance sweetness. 

The results are consistent with the conclusion of Beebe-Center et al. (1959), that some enhancement of sweetness by salt does occur although the major effect was one of masking, and with the report by Fabian and Blum (1943) that near-threshold concentrations of salt enhance sweetness. 

Future research might well be devoted to more intensive study, at just above threshold salt concentrations, of sucrose concentrations between .5% and 6%, the region in which the shift from enhancing to masking appears to occur. (Error MS = 2.24.) 

Fig. 1. Summary of taste interactions. (The abscissa represents increasing concentrations of the secondary stimulus. The four curves in each graph are for the four levels of the primary stimulus whose taste quality is shown on each graph. See Table 1 for concentrations. Curve fitting was guided by the significance of the sources of variation in the analyses of variance, rather than by least squares methods.) 


Upon sourness.—The results of this experiment are more complex than those of the others. Apart from the citric acid itself, Component-II only of salt had an effect (P < .01); the highest and lowest (0%) salt concentrations had less effect than the middle ones. This result, however, should be viewed in the light of the significance of two interaction terms: Salt-I X Citric Acid-I (P < .001); Salt-I X Citric Acid-III (P < .01). The latter may have arisen because Component III has some linear effects. The highest levels of salt tended to enhance the sourness of the lower concentrations of citric acid, but reduced the sourness of the higher acid concentrations. 

From these two significant sources of variation, from the fact that they appeared in both sessions, from the absence of other significant effects (except between-judge variation), and from inspection of the mean ratings, we can infer that low level salt depressed sourness, but high levels enhanced it so that an over-all effect was not apparent. The higher the acid concentrations the later these two stages appear, in terms of increasing salt concentrations. Thus, .13% salt appeared to have reduced the sourness of the lowest two acid concentrations, but as the salt was increased to .44%, enhancement took place; .44% salt had a depressing effect on the highest two levels of acid, but 1.50% salt seemed to increase sourness. 

No explanation of this nonmonotonic function is apparent. Quantitative chemical analyses of the solutions failed to reveal errors in making them up, nor did examination of the ratios of the normality of the salt to the normality of the acid suggest a chemical explanation. What is needed is an extended range of salt, a more detailed study of the points around the minimum sourness intensities produced by salt at each citric acid concentration, and use of other sour acids as stimuli. (Error MS = 2.74.) 


Effects of Sucrose 


Upon bitterness.—Sucrose reduced the intensity of bitterness. Component I was significant at the .001 level, while Components II and III were significant at the .01 level. Although successively increasing concentrations consistently reduced bitterness, the highest sucrose concentration seemed to have a disproportionately large effect. 

Two solutions (.45% sucrose, .076% caffeine; 1.9% sucrose, 50% caffeine) appeared to deviate somewhat from the downward trend. The significance (P < .001) of the Primary-III X Secondary-III X Session interaction largely reflects this deviancy and the fact that it occurred for only one session. Hence, its bearing on the major conclusion is negligible. 

The Caffeine-I X Session interaction 
was also significant (P < .001), and was due to the fact that Os in Session 1 did not use the upper categories of the rating scale as often as did Os in Session 2; thus, the range of their average ratings was lower. This phenomenon, commonly appearing during psychophysical tests, is not considered important. 

Except for the two deviant solutions on one session, the effects of sucrose were fairly clear-cut: sucrose reduced bitterness. (Error MS = 2.74.) 

Upon saltiness.—Sucrose had no general enhancing or masking effects on saltiness, a result in agreement with that reported by Beebe-Center et al. (1959). Also in agreement was the indication that the relationship might be somewhat complex. Thus, the interaction of Salt-III with Sucrose-III was significant (P < .01); and inspection of the mean ratings suggests that this may be attributable to the highest sucrose concentration (and to a lesser extent, the next-to-lowest sucrose concentration) rather sharply reducing the saltiness of the highest and next-to-the-lowest salt concentrations. Why this effect did not occur for the second highest salt concentration is not readily apparent. 

The Salt-II X Sucrose-I interaction was also significant (P < .01), but this effect seemed to occur primarily for one session. Because the triple interaction involving Session was also significant (P < .001), not much importance is attributed to the significance of the simple interaction. 

Apart from the absence of over-all enhancing or masking effects, the results of this experiment are not as definitive as those of the others. Further research should not only replicate this experiment but also extend the sucrose concentrations to perhaps 15 or 20%. (Error MS = 1.69.) 

Upon sourness.—All three components of sucrose were significant, Component II at the .01 level and the others at the .001 level. Sucrose reduced the intensity of sourness, a result in general agreement with that reported by Fabian and Blum (1943) and those cited by them. The sharpest drop in intensity occurred for the 6.00% sucrose concentration. If the range of sucrose concentrations were extended, the masking effects would probably be even more pronounced. (Error MS = 2.60.) 


Effects of Citric Acid 


Upon bitterness.—Citric acid very 
markedly enhanced bitterness, as is 
demonstrated by the significance of all 
three components, I and III at the .001 
level, Component II at the .01 level. 
The steepest rise in bitterness was evi- 
dent between the highest and next 
highest citric acid concentrations. The 
Citric Acid-I X Caffeine-I interaction 
was also significant (P < .001). The 
acid had a proportionately greater effect 
upon the lower concentrations of caffeine 
than on the higher concentrations. This 
effect may be partly at least due to 
restriction of the rating scale at the 
highest caffeine concentrations. 

The Caffeine-I X Session interaction was also significant (P < .001). In Session 2, Os used a narrower range in evaluating the solutions. No special importance is attributed to this result. 

The conclusions from this experiment are probably so clear that further study would not be as fruitful as with several of the other interactions. (Error MS = 2.79.) 

Upon saltiness.—Saltiness was generally enhanced by citric acid, as shown by the significance (P < .001) of the Citric Acid-I component. The enhancement has not been shown to be dependent upon the level of salt. The absolute increase was not very marked, being only .32 scale points between the lowest two and highest two citric acid concentrations; but the error term is the lowest of all interaction experiments. 

The Salt-I X Session interaction was also significant at the .001 level. The Os in Session 2 tended to restrict their range of ratings, this restriction being manifested more at the higher levels. As in the previous interaction, this finding is not of any special importance. 

Extending the range of citric acid concentrations might reveal a drop-off in enhancement, particularly for the lower salt concentrations, and perhaps an eventual masking for the higher salt concentrations. (Error MS = 1.47.) 

Upon sweetness.—Citric acid generally increased sweetness (P < .01). This major conclusion is in agreement with that of Fabian and Blum (1943), who used only near-threshold concentrations of citric acid. 

The only other significant (P < .01) source of variation, Primary-I X Session, means that in Session 1, Os rated the solutions containing the weakest concentrations of sucrose higher than did Os in Session 2. Thus, the range of ratings was more restricted on Session 1 than on Session 2. 

Extending the range of citric acid concentrations should aid in better defining the mathematical relationship between citric acid and perceived sweetness, particularly in determining whether the increase finally levels off and perhaps changes to a decrease. (Error MS = 1.98.) 


DISCUSSION 
In most experiments the results were clear-cut, and the functional relationships between the primary and secondary stimuli were either monotonic or nonexistent. Where ambiguities or hints of a trend with different stimulus concentrations appeared, recommendations for follow-up research were indicated. No secondary stimulus had a uniformly enhancing or depressing effect on the remaining three primaries; nor was any primary uniformly enhanced or depressed by other secondaries. Also, what happened at near-threshold stimulus concentrations was not necessarily predictive of suprathreshold phenomena. 

The general results from the “linear” comparisons may be summarized as follows: (a) Caffeine does not appear to affect saltiness, nor does salt appear to affect bitterness. (b) Caffeine does not seem to increase or decrease sweetness, but sucrose depresses the perceived intensity of bitterness. (c) Caffeine and citric acid have a mutually enhancing effect upon the taste quality specific to each. (d) Salt decreases sweetness, but sucrose does not appear to affect saltiness. (e) Salt seems to have no monotonic effect upon sourness, but citric acid increases saltiness. (f) Sucrose decreases sourness, but citric acid enhances sweetness. 

Conspicuously absent from this paper is reference to physiological correlates of taste perception. The assumption that four primary taste qualities exist does not imply that four types of receptors also exist. Pfaffmann (1958), in his study of electrical recording of nerve impulses from single taste nerve fibers of the rat, was unable to find complete specificity of receptor action. For example, NaCl and sugar activated the same sensory nerve fiber and its attached sense endings. Pfaffmann concluded that the possible afferent discharge patterns may include not only an increase but a decrease in neural flow, and that the primary taste qualities represent nodal points in the manifold of taste sensations rather than basic receptor types. 

The taste interactions reported here 
probably have a neurological basis, and 
the complexity of some of the results 
of the interaction experiments (e.g., 
NaCl-sucrose interactions) may reflect 
complex neural patterns such as those 
described by Pfaffmann. 

The design, method, and Os used here were not typical of those used in most psychophysical research. One intrinsic disadvantage of the half-replicate design with specified confounding relationships is that no O evaluates all solutions, so that the error terms for a few sources of variation are larger than in a full factorial design and so that irregularities in curves can be attributed either to differences in the two groups or to “real” effects. Balanced against this weak point is the efficiency of the design and the single stimulus method (cf. matching method) in minimizing testing time of Os and in avoiding loss of motivation through testing of all solutions at one sitting. It was not feasible to have the same Os return to a second session to complete the ratings of the eight samples they did not test during the first. Nevertheless, in view of suggestions of possible significance of the variable which was confounded with Judge-Group, a different confounding relationship might have proved to be more revealing. 

The relative difficulty in rating (the ambiguity of the sensations) is partially reflected in the magnitude of the error terms. Saltiness seemed to be the easiest to evaluate, insofar as Os tended to agree more with each other on this quality than on the others. Sweetness was next lowest, but bitterness and sourness had errors of the order of twice that of saltiness. A design similar to the one used here would be too laborious to employ for interactions among three or four stimuli, but the present data should prove useful in selecting the optimum concentrations for exploring the relationships among mixtures of more than two stimuli. 


SUMMARY 


Twelve experiments were conducted to determine how each unitary taste quality is affected by each of the other taste qualities. Solutions containing stimuli appropriate to both taste qualities were rated for intensity of sweetness, saltiness, sourness, or bitterness by a group of judges. Each experiment was independently replicated. In most cases, the effects were those of simple enhancement or masking, or no effect at all was found. Certain exceptions and complex relationships occurred, and recommendations for follow-up research were made. Various aspects of the method and design were discussed. 


DIFFERENTIAL COST, GAIN, AND RELATIVE FREQUENCY 
OF REWARD IN A SEQUENTIAL CHOICE 
SITUATION ¹ 


JEROME L. MYERS, RAYMOND E. REILLY, AND HARVEY A. TAUB 


University of Massachusetts 


In a recent experiment Taub and Myers (1961) studied the effects of differential gain in a two-choice situation. The gain received by S depended on which of two events was correctly predicted (the less frequent event paid more); the cost of an incorrect prediction was the same, regardless of which event had been predicted. The Ss did not approximate the optimal strategy² of always predicting the event associated with the higher expected value (EV). That Ss could discriminate between EVs was indicated by two results: (a) the event with the higher EV was predicted more than 50% of the time in all combinations of conditions, and (b) the greater the difference in EVs (ΔEV) for two events, the greater was the difference in percentage prediction of the two events. 

The present study extends Taub and Myers’ investigation by introducing the variable of cost ratio, i.e., the ratio of the loss associated with an incorrect prediction of one event relative to the loss associated with an incorrect prediction of the alternative event. Interactions among the effects of cost, gain, and frequency are of special interest, as are the relative effects of gain and cost upon performance. The relation of choice behavior to the EVs associated with events will be considered. 

1 This experiment was supported by funds provided by the United States Navy Training Device Center, Port Washington, New York, under Contract 61339-5388. 

2 Optimal strategy here refers to that sequence of choices which maximizes the expected monetary payoff. 


METHOD 


Apparatus.—The apparatus has been described in detail elsewhere (Taub & Myers, 1961). Four panels, each 8½ in. high and 7 in. wide, were mounted at a 45° angle away from S. Each panel contained four lights and two switches, arranged in two columns. The upper two lights were green and constituted the informing lights. The lower two lights were amber and indicated S’s choice on each trial. The choice lights were controlled by the two switches operated by S. 

The informing light sequence was constructed from a table of random numbers subject to the following restrictions: (a) each light appeared a designated percentage of the time; (b) only one light was correct on each trial. The sequence was programed by a Western Union Tape Transmitter. 

Subjects.—The Ss were 216 undergraduate volunteers attending the University of Massachusetts. None had previously participated in an experiment of this sort. 

Procedure.—Each S was given 100 chips, worth ½¢ each, at the beginning of the session. The Ss were instructed in the use of the choice switches, and in the purpose of the experiment. They were told how much a correct choice of either light would pay, and how much an incorrect choice of either light would cost. Chips were cashed in at the end of 150 trials. 

The Ss were assigned randomly to 27 groups of 8, each serving under a different combination of relative frequency, gain, and cost ratios. The relative frequencies (percentages of occurrence of the two informing lights) were 90-10, 70-30, and 50-50. Gain ratios were 1 chip to 1 (1:1), 1 chip to 2 (1:2), and 1 chip to 4 (1:4). Cost ratios were the same as these. In all cases the light with the lower frequency of occurrence yielded the higher gain and the lower cost. Under the 50-50 conditions the light with the higher gain had the lower cost. In each of the 27 cells, 4 Ss saw the low gain light in the right panel position, and 4 Ss saw it in the left panel position. 


RESULTS AND DISCUSSION 


The mean percentages of prediction in the last 50 trials of the high frequency, high cost, 1-chip gain light (E1) are presented in the column labeled p in Table 1. The number of choices of an event increases as relative frequency increases, as cost decreases, and as the gain associated with the alternative decreases. The data from the 70-30 and 90-10 groups involving 1:1 gain and cost ratios suggest that Estes’ matching solution for the noncontingent case is unsatisfactory under monetary incentive conditions. Although this solution has been verified in a number of experiments not involving monetary incentives (e.g., Estes & Straughan, 1954; Grant, Hake, & Hornseth, 1951), several other studies using incentive have resulted in data similar to those of the two groups under discussion in the present study (Edwards, 1956; Goodnow, 1955; Siegel & Goldstein, 1959; Taub & Myers, 1961). All these studies which used incentive obtained p values which were close together at comparable levels of relative frequencies. For example, Goodnow, Edwards, and the present investigators all report p values for π = .70 (for 100 to 150 trials) between .80 and .83. Thus it appears that even the small monetary incentives used in the present study are sufficient to evoke choice probabilities above matching. An alternative explanation is that the experiments cited provided not only incentive, but also a form of feedback (increasing and decreasing piles of chips or coins) lacking in experiments which yielded matching. The relative contributions of these two factors of motivation and feedback to choice behavior remain to be explored. 

TABLE 1 
EXPECTED VALUE (EV), DIFFERENCES IN EXPECTED VALUE (ΔEV), AND PERCENTAGE CHOICE OF E1 IN THE LAST 50 TRIALS 

Cost     Gain     Relative 
Ratio    Ratio    Frequency       EV       ΔEV        p 

1:1      1:1      50-50          ...       ...       ... 
                  70-30          ...       ...       ... 
                  90-10          ...       ...       ... 
1:1      1:2      50-50          ...       ...       ... 
                  70-30          ...       ...       ... 
                  90-10          ...       ...       ... 
1:1      1:4      50-50          .00     -1.50     32.75 
                  70-30          .40      -.10     45.75 
                  90-10          .80      1.30     86.00 
1:2      1:1      50-50         -.50      -.50     27.75 
                  70-30          .10       .50     76.25 
                  90-10          .70      1.50     91.25 
1:2      1:2      50-50         -.50     -1.00     22.50 
                  70-30          .10       .20     59.75 
                  90-10          .70      1.40     90.25 
1:2      1:4      50-50         -.50     -2.00     33.75 
                  70-30          .10      -.40     54.25 
                  90-10          .70      1.20     91.50 
1:4      1:1      50-50        -1.50     -1.50     11.00 
                  70-30         -.50      -.10     65.25 
                  90-10          .50      1.30     91.00 
1:4      1:2      50-50        -1.50     -2.00     17.75 
                  70-30         -.50      -.40     46.75 
                  90-10          .50      1.20     89.00 
1:4      1:4      50-50        -1.50     -3.00     15.50 
                  70-30         -.50     -1.00     32.50 
                  90-10          .50      1.00     81.75 
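
The EV and ΔEV entries of Table 1 can be reproduced with a short calculation. The sketch below assumes that a correct prediction of E1 pays 1 chip, that an incorrect prediction of the alternative costs 1 chip, and that the cost and gain ratios set the chips lost on an incorrect E1 prediction and won on a correct prediction of the alternative. These payoff assumptions agree with the legible table entries, and the function and parameter names are illustrative.

```python
def expected_values(p_e1, cost_e1, gain_alt, gain_e1=1, cost_alt=1):
    """Return (EV of predicting E1, EV of predicting the alternative), in chips.

    p_e1     -- relative frequency of E1 (e.g. 0.7 for the 70-30 condition)
    cost_e1  -- chips lost when E1 is predicted and does not occur (assumed 1, 2, or 4)
    gain_alt -- chips won when the alternative is predicted and occurs (assumed 1, 2, or 4)
    """
    ev_e1 = p_e1 * gain_e1 - (1 - p_e1) * cost_e1
    ev_alt = (1 - p_e1) * gain_alt - p_e1 * cost_alt
    return ev_e1, ev_alt


def delta_ev(p_e1, cost_e1, gain_alt):
    """Difference in expected value, EV(E1) minus EV(alternative)."""
    ev_e1, ev_alt = expected_values(p_e1, cost_e1, gain_alt)
    return ev_e1 - ev_alt


def optimal_prediction(p_e1, cost_e1, gain_alt):
    """Light that maximizes expected payoff on every trial."""
    difference = delta_ev(p_e1, cost_e1, gain_alt)
    if difference > 0:
        return "E1"
    if difference < 0:
        return "alternative"
    return "either"


# Example: the 1:4 cost, 1:1 gain groups at each relative frequency.
for p_e1 in (0.5, 0.7, 0.9):
    ev_e1, _ = expected_values(p_e1, cost_e1=4, gain_alt=1)
    print(p_e1, round(ev_e1, 2), round(delta_ev(p_e1, 4, 1), 2),
          optimal_prediction(p_e1, 4, 1))
```

Because ΔEV depends only on the sum of the cost and gain values at a given relative frequency, conditions such as 1:2 cost with 1:4 gain and 1:4 cost with 1:2 gain share a ΔEV while differing in the EV of E1.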

The optimal solution, in terms of maximizing monetary payoffs, would be always to predict the light with the higher EV. At the end of 150 trials, few Ss are utilizing this solution. The 50-50 groups’ failure to approach this solution more closely is particularly difficult to account for. Each of eight groups is faced with two equally frequent events, one of which pays as much or more and loses the same or less than its alternative. Yet percentage choice of the low-EV light ranges from 10.5 to 35.5. It is not clear whether Ss expect the higher cost light to occur more frequently or whether they perceive the situation correctly but are incapable of reaching a rational decision, at least in 150 trials. 

The results of an analysis of variance of the frequency of choice appear in Table 2. All results were significant at the .01 level except Cost X Gain (P > .05), Cost X Frequency (P < .025), and Cost X Gain X Frequency (P < .05). 

TABLE 2 
ANALYSIS OF VARIANCE OF THE FREQUENCY OF CHOICE OF E1 

Source                         MS            F 

Relative frequency (F)     17,489.24     440.76*** 
Cost (C)                    1,298.44      32.72*** 
Gain (G)                      823.33      20.13*** 
F X C                         119.48       3.01** 
F X G                         323.04       8.14*** 
C X G                          54.56       1.38 
F X C X G                      80.65       2.03* 
Error                          39.68 

* P < .05. 
** P < .025. 
*** P < .001. 
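
Each F in Table 2 is the ratio of an effect mean square to the error mean square. The sketch below recomputes those ratios from the mean squares as they read in the table. The degrees of freedom are not taken from the table; they are derived, as an assumption, from the design described under Method (three levels of each variable, 27 groups of 8 Ss). All names are illustrative.

```python
ERROR_MS = 39.68  # error mean square as read from Table 2

# Effect mean squares as read from Table 2.
MEAN_SQUARES = {
    "F": 17489.24,
    "C": 1298.44,
    "G": 823.33,
    "F x C": 119.48,
    "F x G": 323.04,
    "C x G": 54.56,
    "F x C x G": 80.65,
}


def f_ratio(effect_ms, error_ms=ERROR_MS):
    """F statistic for a between-subjects effect tested against the error term."""
    return effect_ms / error_ms


def degrees_of_freedom(levels=3, groups=27, ss_per_group=8):
    """Degrees of freedom implied by a balanced 3 x 3 x 3 design (assumed)."""
    main = levels - 1
    return {
        "main effect": main,
        "two-way interaction": main ** 2,
        "three-way interaction": main ** 3,
        "error": groups * (ss_per_group - 1),
    }


for source, ms in MEAN_SQUARES.items():
    print(f"{source:10s} F = {f_ratio(ms):.2f}")
print(degrees_of_freedom())
```

The recomputed ratio for Gain differs somewhat from the tabled F, so one of those two entries may not read exactly as printed; the other ratios agree with the table to two decimals.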


The extent and direction of main and interaction effects can be predicted by assuming a linear relation between ΔEV and p. If the mean ΔEVs for the three levels of relative frequency, of gain, and of cost are computed from the ΔEV column of Table 1, it is evident that (a) differences should exist among the mean p values for the levels of each of these variables, (b) gain and cost should contribute about equally to the total variance (i.e., the variances of the mean ΔEV for these two variables are the same), and (c) frequency should contribute about nine times as much variability to the data matrix as should gain or cost. Actually, cost seems to have a somewhat greater effect than gain, and the frequency mean square is larger in relation to those of cost and gain than the variance of the ΔEVs would suggest. The mean ΔEVs for cost-frequency combinations yield the prediction that as the cost associated with E1 increases, p should decrease, and that this decrease should be greater as relative frequency decreases. This in fact occurs, the greatest spread among the mean p values for level of cost being observed for the 50-50 relative frequency condition. A similar prediction can be made for the Gain X Frequency interaction, that as the gain associated with the alternative increases, p should decrease, and that this decrease should be greater as relative frequency decreases. The 70-30 frequency condition does show a greater variability among the gain means than does the 90-10 condition, but the three gain means are almost identical for the 50-50 condition. Comparing the results for the three cost ratios with those for the three gain ratios at 50-50 relative frequency leads to the conclusion that, at this level of frequency, increased cost associated with E1 is more effective in decreasing choice of E1 than is increased gain associated with the alternative. Table 1 substantiates this inference. Note these pairs of conditions: the 1:1 cost, 1:2 gain, 50-50 condition, and the 1:2 cost, 1:1 gain, 50-50 condition; the 1:1 cost, 1:4 gain, 50-50 condition, and the 1:4 cost, 1:1 gain, 50-50 condition; the 1:2 cost, 1:4 gain, 50-50 condition, and the 1:4 cost, 1:2 gain, 50-50 condition. Although the ΔEVs are the same for the members of each pair, increased cost associated with E1 yields a lower p value than increased gain on the alternative, in all three instances. 
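
Predictions (b) and (c) can be checked by averaging ΔEV over the 27 design cells. The sketch below assumes the same payoffs as the legible entries of Table 1 imply (1 chip for a correct E1 prediction, 1 chip lost on an incorrect prediction of the alternative, and 1, 2, or 4 chips for the cost on E1 and the gain on the alternative); the names are illustrative.

```python
from itertools import product
from statistics import mean, pvariance

FREQUENCIES = (0.5, 0.7, 0.9)  # relative frequency of E1
COSTS = (1, 2, 4)              # chips lost on an incorrect E1 prediction (assumed)
GAINS = (1, 2, 4)              # chips won on a correct alternative prediction (assumed)


def delta_ev(p_e1, cost_e1, gain_alt):
    """EV(E1) minus EV(alternative), in chips."""
    ev_e1 = p_e1 * 1 - (1 - p_e1) * cost_e1
    ev_alt = (1 - p_e1) * gain_alt - p_e1 * 1
    return ev_e1 - ev_alt


CELLS = list(product(FREQUENCIES, COSTS, GAINS))  # the 27 design cells


def level_means(position):
    """Mean delta-EV at each level of one variable (0 = frequency, 1 = cost, 2 = gain)."""
    levels = sorted({cell[position] for cell in CELLS})
    return [
        mean([delta_ev(*cell) for cell in CELLS if cell[position] == level])
        for level in levels
    ]


var_frequency = pvariance(level_means(0))
var_cost = pvariance(level_means(1))
var_gain = pvariance(level_means(2))
ratio = var_frequency / var_cost

print(level_means(0), level_means(1), level_means(2))
print(round(var_frequency, 3), round(var_cost, 3), round(var_gain, 3), round(ratio, 2))
```

Under these assumptions the variances of the mean ΔEVs for cost and gain coincide, and the variance for relative frequency is between eight and nine times as large, in line with the "about nine times" stated in the text.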

Differences in EV do partially account for the rank order of p values and for some main and interaction effects. However, the authors are not proposing a theory of choice behavior based on the assumption of a relationship between ΔEV and p. While the simple and familiar concept of EV does account for many relationships among variables and, in the absence of any more successful formal model, provides some frame of reference within which to organize the data, it is evident that ΔEV does not sufficiently account for the data obtained. The results of the various 50-50 conditions are particularly deviant, since the data consistently suggest that when two events occur equally often, the tendency to choose the one resulting in the smaller loss is much greater than the tendency to choose the event resulting in the larger gain. 


SUMMARY 


Gain, cost, and frequency ratios were 
manipulated for two events in a sequential 
choice situation with monetary incentives. 
Percentage prediction of the events was 
strongly influenced by these variables and 
by their interaction. The relation between 
percentage prediction of an event and 
the expected value associated with it was 
discussed. 
THE EFFECT OF RECALL UPON RECOGNITION ¹ 


NELSON G. HANAWALT AND ARLENE G. TARR ² 


Rutgers University 


Belbin (1950) published a paper which showed rather dramatically that recall had a depressing effect upon recognition. These experiments were inspired by a paper by Postman, Jenkins, and Postman (1948) in which recall and recognition were compared. 


Postman et al. (1948) gave two groups of Ss six trials on a list of 48 nonsense syllables. The syllables were read in random order for the different trials with instructions to remember as many as possible. Immediately after learning Group I took a recognition test followed by a 10-min. recall period. Group II had the tests in the reverse order. Recognition first had a facilitating effect upon recall, which did not surprise the authors, but they were not prepared for the fact that recall first depressed the score on the recognition test. The above experiment was not designed to test the effect of recall upon recognition; consequently the lapsed time for the two recognition tests was not controlled. The authors assumed that the lower score for the recognition test after 10 min. of recall was due to weak traces becoming weaker during the 10-min. delay. 

Belbin’s (1950) experiments were designed specifically to test the effect of recall upon recognition. The learning was incidental and the method that of interpolated recall. In the first experiment 64 Ss spent 2 min. in a waiting room with a one-incident safety poster facing them on an otherwise blank wall; then half of the Ss took a “prompted by standardized questions” recall test, followed immediately by a recognition test. The control group engaged in an unrelated activity following the viewing, and took the recognition test after the same lapse of time. The recognition test was simply a presentation of a duplicate of the original poster with instructions to say yes or no as to whether or not it was the same picture, regardless of the caption. For both Groups I and II, half of them saw a picture in recognition with a different caption. The caption made a difference, but under both conditions Ss who had recalled overwhelmingly said that it was not the same, whereas the control group for the most part identified it correctly. In another experiment Belbin showed that the effect was still present after 3 hr. It was concluded that the chief factors responsible for the results were errors in recall importations and omissions. On the recognition test, the absence of some erroneously recalled detail or the presence of some nonrecalled detail seemed to determine the experimental group’s rejection of the picture as being the same. 

Kay and Skemp (1956), in some preliminary studies, repeated Belbin’s experiment with five groups of Ss, confirming her findings but with much less dramatic results. Experiment I was similar to Belbin’s excepting that a complex picture (a many-incident park scene) was used in the learning period. Belbin’s type of recognition test of the picture as a whole was used, too, but a 30-item questionnaire based upon the incidents in the picture was added. Just how closely Belbin’s design was followed is not clear but apparently a free written recall period was substituted for the “standardized questions.” The recall period came 10 min. after the 40-sec. viewing of the picture, with the recognition test coming 40 min. later. With this material and method, the Belbin effect practically disappeared for the identification of the picture as a whole, since all but 2 of the Ss who had recalled said it was the same. However, confirmatory evidence for the depressing effect of recall upon recognition was found in the 30-item questionnaire. On the 10 items frequently recalled, the recall group did a little better, but on the 20 items less frequently recalled, the nonrecall group did nearly twice as well. 

In Exp. II Kay and Skemp (1956) compared recalled vs. nonrecalled items of the same group of Ss rather than the performance of a recall vs. a nonrecall group. In a study with outline faces, preliminary to Exp. II, they showed that the number of reconstruction errors was an important factor in recognition. However, as a result of Exp. II, which will not be presented here because the writers are unable to understand the exact procedure used, a new hypothesis was constructed, namely, “that the juxtaposition of better and worse known items will raise the threshold of recognition of the latter” (p. 159). In Exp. III some evidence in support of this hypothesis was reported but the authors admit that the experiment was not well controlled. 


The present experiment was de- 
signed to test the effect of interpolated 
recall upon recognition with delayed 
recognition, different material, and 
an improved recognition test. It 
seemed well established that a de- 
pressing effect was present when 
recognition followed immediately after 
recall but there was a question about 
the effect upon delayed recognition. 


METHOD 


The original experiment was conducted in 
1958 with four groups of Ss, two experimental 
and two control, designated below as Sample 
A. In 1960 a replication was run with four 
additional groups designated as Sample B. 
All Ss took a true-false test as the learning 
task. Immediately following learning, Groups 
IA and IB and IIIA and IIIB, the experi- 
mental groups, were allowed 8 min. to recall 
as many as possible of the final adjectives on 
the T-F test. During the same time Groups
IIA and IIB, and IVA and IVB, the control
groups, participated in the usual class activity 
(lecture). Both samples of Groups I and II 
took the recognition test 8 min. after learning; 
Groups IIIA and IVA took the recognition 
test 48 hr. after learning; and, because of a 
change of schedule of classes between 1958 
and 1960, the B groups could not be tested 
after 48 hr., consequently Groups IIIB and 
IVB took their recognition test 52 hr. after 
learning. 

Subjects.—There were 102 Ss in the original 
and 83 in the replication experiment, all 
female and taking general psychology at 
Douglass College. The sections were assigned 
to different places in the design on the basis 
of convenience in testing since it was desirable 
to have the recognition tests come on the 
same day to reduce the possibility of discussion of the design
of the experiment. Neither of the 2 Es had anything to do with
the teaching of the classes.

Learning task.—A learning device was desired which would appear
to be a completed task since two of the groups would not be
doing anything with the material immediately after the learning
period: Groups IIA and IIB for 8 min., Group IVA for 48 hr., and
Group IVB for 52 hr. A life-like situation was also desired. The
test consisted of 23 statements with a subject, copula, and
final predicate adjective, e.g., “Hotel steaks are big,” or
“Brown eggs are expensive.” The statements were all ambiguous in
order to minimize the effect of true or false, and to emphasize
the final adjective which was to be recalled. The recall
instructions were to write down as many as possible of the final
adjectives of the 23 statements on the back of the test paper.
As far as Ss were concerned, the T-F test was the end of the
experiment.

Recognition test.—The recognition test contained only a
five-choice test for the final adjective, numbered in the same
order as the T-F test. A synonym of the correct choice and an
opposite composed two of the choices. The instructions follow:

We will now have a recognition test. From each group of five
choices select the word which appeared at the end of each
sentence and place its letter in the space provided. The choice
groups are in the same order as they appeared in the sentences.
If you are not sure which word is correct, guess. Answer every
question.


RESULTS 


A comparison of the recognition 
scores of the experimental and the 
control groups is presented in Table 1. 
There is no evidence of a depressing 
effect of recall upon recognition since 
the recall groups all produced higher 
mean recognition scores than their 
respective control groups. After 8 
min. of recall Group IA had a sig- 
nificantly higher recognition score 
than IIA, but this difference did not 
reach significance in Sample B. The 
combined Samples A and B main- 
tained a significant difference at the 
5% level. When recognition was 
delayed 48 or 52 hr. there was a facilitation effect of recall
upon recognition as shown by the significantly higher
recognition scores of both experimental samples over their
respective control samples. After 52 hr. the mean recognition
scores were lower than after 48 hr. but the size of the
difference between experimental and control groups was little
changed. It is unclear from the present data whether or not the
facilitating effect changes in degree between 48 and 52 hr.

TABLE 1

MEAN RECOGNITION SCORES

(The body of this table is not recoverable from the source.
Column headings: Experimental (Recall): Group, Test Interval,
Mean; Control (Nonrecall): Group, Test Interval, N. Legible
entries: 8 min., 15.86; 8 min., 2.64**; 2.18*; 48 hr., 4.97***;
52 hr., 4.20***.)

*P < .05.
**P < .02.
***P < .001.

Upon analysis of the results of their Exp. I, Kay and Skemp
(1956) found that the superiority of their control group on the
recognition test was limited to the 20 items not frequently
recalled by their experimental group. A similar analysis was
made of the present data. The 11 most frequently recalled words
(65% recall) were compared with the 11 least frequently recalled
(24% recall). In order to make this comparison the recall
frequency was determined for each of the 23 words. The Ss were
then given two scores on recognition: one score for the words
with a recall value above the median, and another score on the
words with a recall value below the median. The median word was
not scored. The percentage of recall of all the words was 44%
for Sample A and 46% for Sample B, considerably greater than the
24% of Kay and Skemp’s experiment, but favorable for testing
their hypothesis concerning the juxtaposition of better and
worse known items. According to this hypothesis the recall group
would be expected to do poorly on the less frequently recalled
words.
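
The median split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word labels and recall frequencies are hypothetical, the function names `median_split` and `split_scores` are introduced here and do not come from the original study, and ties in recall frequency are broken arbitrarily by sort order.

```python
def median_split(recall_freq):
    """Return (most, least) frequently recalled words, leaving out the median word.

    recall_freq maps each word to the number of Ss who recalled it
    (hypothetical values; ties are broken by the stable sort order).
    """
    ranked = sorted(recall_freq, key=recall_freq.get, reverse=True)
    half = len(ranked) // 2
    # With an odd number of words (23 in the study) the middle word is unscored.
    skip = len(ranked) % 2
    return ranked[:half], ranked[half + skip:]


def split_scores(recognized, most, least):
    """Count one S's correct recognitions separately for the two word lists."""
    recognized = set(recognized)
    return (sum(w in recognized for w in most),
            sum(w in recognized for w in least))


# Hypothetical data: 23 words, recalled by 23, 22, ..., 1 Ss respectively.
freq = {f"w{i}": 23 - i for i in range(23)}
most, least = median_split(freq)
print(len(most), len(least))
print(split_scores(["w0", "w1", "w22"], most, least))
```

Each S thus receives one score out of 11 on each list, which is what allows the recall and nonrecall groups to be compared separately on better and worse known items.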

The percentage of recognition, Samples A and B combined, of the
two 11-word lists and the total list is presented in Table 2.
It is interesting to note that the percentage of recognition for
Group II on all words was the same as for Kay and Skemp’s (1956)
comparable group (64%), but that the results were reversed for
Group I, 69% to their 43.5%. On the frequently recalled words
the relative difference in recognition scores was similar in
their experiment and in the present one, with the recall group
doing better in both cases (88.2% to 80.6% compared to 74% and
64% of the present experiment). Groups I and II showed no
significant difference on the infrequently recalled words,
whereas Kay and Skemp (1956) found that their nonrecall group
did much better over these items (50.6% compared to the recall
group’s 29.1%). It should be noted, also in disagreement with
their results, that the control group of the present experiment
did no better on the frequently recalled words than on the
infrequently recalled words at the 8-min. interval.

TABLE 2

PERCENTAGE OF RECOGNITION ON THE TEST AS A WHOLE COMPARED TO THE
11 MOST FREQUENTLY AND THE 11 LEAST FREQUENTLY RECALLED WORDS

(The body of this table is only partly recoverable from the
source. Columns: After 8 min. and After 48 or 52 hr., each with
a recall and a nonrecall group. Rows: All words; Most frequently
recalled; Least frequently recalled. Legible entries: All words,
69%; Most frequently recalled, 74%; 38%; 62%, 36%; N = 46.)

Note.—The percentages are proportions of scores rather than of
discrete individuals, consequently the significance of the
differences was based upon mean recognition scores and SDs.
Percentages are used in the table in order to compare the
results with those of Kay and Skemp (1956).

With delay in recognition (right half of Table 2), the recall
group was significantly better than the nonrecall group on all
the words as well as for the frequently and infrequently
recalled words. There is no evidence in Table 2 to indicate that
recall has a depressing effect upon recognition, or that the
better known items inhibit the recognition of worse known items.


There is evidence for the facilitating effect of recall upon
recognition which is centered on the frequently recalled words
when recognition came 8 min. after recall, but a facilitating
effect is present also on the infrequently recalled words after
48 to 52 hr. The 24% correct recall apparently allowed for the
strengthening of enough traces to provide a margin of
superiority for this group on the recognition test over the
nonrecall group which had only the traces of the incidental
learning as a basis for response. Since a chance recognition
score is 20%, the nonrecall group apparently had few effective
traces after the lapse of this longer period of time.

The pattern of correlation coefficients between recall and
recognition adds further evidence to the increased influence of
recall upon recognition with the lapse of time. When recognition
followed immediately after recall the rho was .49 for Group IA
and .46 for Group IB. When recognition was delayed for 48 hr.,
Group IIIA, the rho was .60, and for Group IIIB, after 52 hr.,
it was .70.
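
For readers who want to see how a rank correlation of this kind is computed, here is a minimal Python sketch. It assumes that the reported rho is Spearman's rank-order coefficient, which the text does not state outright; the scores below are hypothetical, and `average_ranks` and `spearman_rho` are names introduced here for illustration.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (undefined if either is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical recall and recognition scores for five Ss.
recall = [5, 9, 12, 7, 10]
recognition = [11, 13, 18, 15, 17]
print(spearman_rho(recall, recognition))
```

A larger rho at the longer interval means that the ordering of Ss on recall agrees more closely with their ordering on recognition, which is the sense in which the text speaks of an increased influence of recall.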

Belbin (1950) and Kay and Skemp (1956) have demonstrated the
effect of false recall upon recognition, consequently a detailed
analysis was made of the present data in this respect, using
Groups I and III and combining Samples A and B. In all there
were 201 false recalls classified as follows: 66% synonyms, 19%
intrusions from the beginning of the T-F sentences, 9%
opposites, and 6% unclassified. Fortunately 123 (61%) of the
false recalls were on the recognition test as possible choices.
This was due largely to the fact that a common synonym and an
opposite of the correct word had been included in each multiple
choice group. Thus for the 123 words the Ss had two learned
responses: the correct one learned in the T-F test and a false
one learned in recall. Which would they choose in recognition,
the false recall or the correct word?


The results of the above analysis are presented in Table 3. In
recognition immediately following the recall period, the false
recall had two chances out of three of being selected, and 48 or
52 hr. later three out of four chances. Since 80% of the total
recall was correct, and only 61% of the incorrect recall was
directly effective because of the nature of the recognition
test, the total effect of recall was positive for the present
experiment. If erroneous recalls were more frequent than correct
recalls, and if they had a chance to function directly on the
recognition test, a depressing effect of recall upon recognition
would be a reasonable outcome.

TABLE 3

RECOGNITION RESPONSES OF THE FALSE RECALLS WHICH WERE ALSO A
CHOICE ON THE RECOGNITION TEST

Group      Recognition  Chose False  Chose Correct  Chose Another  Number of
           Interval     Recall       Response       Response       Responses
I(A+B)     8 min.       61%          33%            [illegible]    [illegible]
III(A+B)   48-52 hr.    74%          18%            8%             57

The experiment of Postman et al. (1948) replicated with control
for time interval.—Since Belbin (1950) and Kay and Skemp (1956)
interpreted the results of the experiment of Postman et al. as
evidence for the depressing effect of recall upon recognition,
it was decided to repeat it with the time interval between
learning and recognition equalized. Otherwise the experiment was
replicated as closely as possible. The experimental (recall) and
the control (nonrecall) groups were selected at random from an
advanced class in psychology composed of junior and senior
college women. There were 16 Ss in each group compared to 35 for
Postman et al. (1948). Following the learning period (described
above), the experimental group had 10 min. for recall of the
nonsense syllables; during a similar period the control group
worked on pencil and paper mazes. Both groups took the
recognition test 10 min. after the last learning trial.

The experimental group had a mean recognition score of 24.69
(SD = 6.25) and the control group 26.94 (SD = 4.85). Postman et
al. (1948) reported a mean of 24.23 for the experimental group
and 27.66 for the control group, a difference which was
significant at the 1% level. The difference between the means of
the experimental and the control groups in the present
experiment was not significant, consequently no statement can be
made concerning a facilitating or a depressing effect of recall
upon recognition. If the replication were a success, as was
indicated by the close agreement in the means of the
experimental groups where the conditions were identical for the
two experiments, it is reasonable to suppose that the
significant difference which Postman et al. reported was, as
they suggested, due to the further weakening of already weak
traces during the 10 min. delay.


DISCUSSION 


That correct recall may have a positive, and incorrect recall a
negative, effect upon recognition, was established by Zangwill
(1937, 1939) and Hanawalt (1937), and confirmed in the results
of the present experiment. A recent survey (Hanawalt, 1958) has
evaluated these earlier studies, and some other recent English
experiments which pertain to the present problem, consequently
discussion will be limited to the studies outlined in the
introduction of this paper, the only ones which suggest an
over-all depressing effect of recall upon recognition.

The modified replication of the Postman et al. (1948) study
clarified to some extent the relation of their study to the
present one. When the time interval between learning and
recognition was controlled, the evidence which Belbin (1950) and
Kay and Skemp (1956) saw in it for the depressing effect of
recall upon recognition disappeared. On the other hand, the
failure of a facilitating effect of recall upon recognition to
appear makes the evidence for it at the 8-min. interval in the
present study more doubtful. It is quite obvious that there is
neither a facilitating nor a depressing effect of recall present
over all methods and material at this time interval. No evidence
was discovered in the replication which would require a change
in the interpretation of results
by the original authors.

The reason for the apparent contradiction of the results of
Belbin (1950) and of Kay and Skemp (1956) with those of the
present experiment can scarcely be attributed to a different
learning method since incidental learning was common to all
three, nor to stimulus material since it was in each case
meaningful and familiar. The chief differences between Belbin’s
experiment and the present one were in the learning situation,
the recall activity, and the type of recognition test,
consequently these are the areas most likely to reveal an
explanation of the difference in results. That the nature of the
recall activity is important appears to be borne out by the
failure of Kay and Skemp to reproduce Belbin’s results in their
Exp. I on the identification of the picture as a whole. They
accounted for this on the basis of an attitude change on the
part of Ss because of the many-incident picture in contrast to
Belbin’s single-incident picture. This was doubtless a factor
but there was another important difference: Belbin used
“prompted by standard question” recall where Kay and Skemp used
free written recall. Belbin’s method of recall apparently
facilitated the production of errors, and her recognition method
the effective use of them. The importance of a detail on this
type of test was exemplified by the responses of the 16 Ss of
her nonrecall group to the changed caption on the picture. Even
though they were directed to disregard it, 7 of the 16 failed to
identify the picture, which reduced this half of the control
group to a chance performance.

That the type of recognition test is 
important is borne out by the fact that 
Kay and Skemp did succeed in repro- 
ducing the Belbin effect with their 30- 
item questionnaire where they had 
failed on the picture as a whole. The 
method was essentially the same but 
S had 30 opportunities to say yes or no 
instead of one. Since the type of recog- 
nition was the outstanding difference in 
the design of Kay and Skemp’s Exp. I 
and the present one, it is possible that 
this was an important factor in the
divergence of results. In support of this
possibility is the fact that the depressing 
effect of recall upon recognition appeared 
in neither the present experiment nor 
in the replication of the Postman et al. 
experiment, both of which used a 
multiple choice type of recognition test. 

Another factor of importance in the 
difference in results between Kay and 
Skemp’s Exp. I and the present one, is 
the difference in the percentage of correct 
recall: 24% compared to the present 
45%. Since there appears to be no 
disagreement concerning the facilitating 
effect of correct recall upon recognition, 
the higher percentage of correct recall 
of the present experiment would boost 
the scores of the recall groups. The 
facilitating effect for this time interval 
(Table 2) was limited to the 11 fre- 
quently recalled items. Of course a 
high error score would also produce a 
depressing effect. Since the percentage 
of error of Kay and Skemp’s experiment 
and the present one was approximately 
the same (21% to 20%) this may not 
have been too important. However, it 
must be remembered that 39% of the 
recall errors did not appear on the 
recognition blank of the present study 
and hence were probably ineffective. 
In the light of Belbin’s results an er- 
roneous recall appears to be much more 
effective on a recognition test of the type 
used by Kay and Skemp than on a mul-
tiple choice type, but possibly less effective
in determining the strength of faint 
memory traces. Factors aside from the 
specific trace of the learning period, such 
as attitude, inference, etc. would seem to 
be more important in the former. 

A further word should be said con- 
cerning the learning methods. Both 
Belbin and Kay and Skemp had Ss 
simply observe the pictures, while Post- 
man et al. and the present experiment 
required a specific reaction to each item. 
Kay and Skemp reported some intro- 
spective evidence indicating that many 
associations were to the picture as a 
whole. It is possible that some details
were not reacted to at all which would
explain the low percentage of recall. A
fleeting reaction to other items would
tend to increase the error in recall and
consequently, for this group, the error
in recognition.

The above discussion of the divergent results concerns chiefly
the 8-min. recognition period, since the present experiment is
the only one which tested after several days. At this longer
time interval there appears to be no doubt that recall has a
facilitating effect under the conditions of the present
experiment. The results are consistent with those of Hanawalt
(1937, pp. 57-63) on memory for designs, where recall was found
to have a facilitating effect for at least 8 wk. When learning
is incidental, it is reasonable that the effect of recall should
increase with delay in recognition, provided that there is more
correct than incorrect recall, and the recognition test is one
which favors the operation of traces of the learning period over
general responses of the S. When recall and recognition follow
immediately after learning, the traces are all relatively fresh
so that traces strengthened by recall have less chance to show
their strength. With the passing of time more of the incidental
learning traces drop below functional strength for recognition,
but many of the relatively stronger recall traces, whether
correct or incorrect, are still above threshold value, as was
inferred above in connection with Table 2.

Recognition, as Woodworth (1948, p. 81) pointed out for recall,
is a problem to which S brings all of his “available responses,”
only some of which are based upon the traces of the learning
period. Correct recall preceding recognition raises the score,
and incorrect recall lowers it, but this is not the whole story.
Even when the trace has been strengthened by correct recall,
there are still erroneous recognitions, and correct recognitions
(as many as one-third in the present results) when the trace has
been weakened by erroneous recall. The S’s response on the
recognition test is apparently influenced by subtle experimental
factors, as suggested above, which bring variable general
responses to bear upon the final decision.


SUMMARY 


The effect of interpolated recall upon recognition was measured
at different time intervals: immediately following recall and 48
or 52 hr. after recall. The learning was incidental in the form
of a true-false test and recognition was measured by a multiple
choice test. Interpolated recall produced some equivocal
evidence for a facilitating effect of recall upon recognition
when recognition followed immediately after recall; when
recognition was delayed 48 or 52 hr., the facilitating effect
was unequivocal. A test of the hypothesis that in recognition
better known items have a depressing effect upon worse known
items produced negative results. The effect of both erroneous
and correct recall was found to increase with the lapse of time.
The disagreement of the present results with some recent studies
showing a depressing effect of recall upon recognition is
discussed in the light of possible factors accounting for the
differences. More research is needed before it will be possible
to predict when recall will have a facilitating, a depressing,
or no effect upon recognition.


REFERENCES 


BELBIN, E. The influence of interpolated recall upon
recognition. Quart. J. exp. Psychol., 1950, 2, 163-169.

HANAWALT, N. G. Memory trace for figures in recall and
recognition. Arch. Psychol., NY, 1937, No. 216.

HANAWALT, N. G. Remembering. In G. S. Seward & J. P. Seward
(Eds.), Current psychological issues. New York: Holt, 1958.
Pp. 53-85.

KAY, H., & SKEMP, R. Different thresholds for recognition:
Further experiments on interpolated recall and recognition.
Quart. J. exp. Psychol., 1956, 8, 153-162.

POSTMAN, L., JENKINS, W. O., & POSTMAN, D. L. An experimental
comparison of active recall and recognition. Amer. J. Psychol.,
1948, 61, 511-519.

WOODWORTH, R. S. Experimental psychology. New York: Holt, 1948.

ZANGWILL, O. L. An investigation of the relationship between the
process of reproducing and recognizing simple figures with
special reference to Koffka’s trace theory. Brit. J. Psychol.,
1937, 27, 250-276.

ZANGWILL, O. L. Some relations between reproducing and
recognizing prose material. Brit. J. Psychol., 1939, 29,
370-382.


(Received July 25, 1960) 





Journal of Experimental Psychology 
1961, Vol. 62, No. 4, 368-371 


THE RELATIVE EFFECTIVENESS OF POSITIVE AND 
NEGATIVE VERBAL REINFORCERS 


AUSTIN JONES¹


University of Pittsburgh 


Buchwald (1959a, 1959b) and Buss 
and his associates (Buss, Braden, 
Orgel, & Buss, 1956; Buss & Buss, 
1956; Buss, Wiener, & Buss, 1954) 
have studied acquisition and extinc- 
tion as a function of three different 
combinations of verbal reinforcers:
Right-Wrong, Nothing-Wrong, and
Right-Nothing. It was uniformly 
found that acquisition is weaker 
under the Right-Nothing combination 
than under Nothing-Wrong and Right- 
Wrong. No significant differences 
have been found between the latter 
two. To account for the lesser
efficacy of the Right-Nothing com- 
bination, Buss, Braden, Orgel, and 
Buss (1956) proposed that Nothing 
is a nonreinforcer and that Right is a 
relatively weak positive reinforcer, 
while Wrong is a relatively strong 
negative reinforcer. Buchwald (1959a) 
cites evidence in support of an alter-
native explanation, “that when S is
exposed to either RN or to NW, N 
acquires a reinforcement value oppo- 
site in direction to that of the event 
with which it is combined, and that 
the value acquired by N in the NW 
combination exceeds that which it 
acquires in the RN combination” 
(p. 351). It should be noted that the two formulations are not
mutually exclusive; it is conceivable that Right may be shown to
be a weaker reinforcer than Wrong and also that simultaneously
Nothing tends to acquire a value opposite to that of the word
with which it is paired.

¹ The author wishes to thank Arnold Buss and William Meyer for
reading the manuscript, and H. Jean Wilkinson for assistance in
the collection of data.

The purpose of the present study 
was to assist in the clarification of 
these issues by providing a direct 
test of the hypothesis that Right is 
a weaker reinforcer than Wrong. 
Each of the studies cited above pro- 
vides only inferential evidence on 
the question, since the two words 
were always arranged in a combina- 
tion, either with each other or with 
Nothing. In the present experiment, 
acquisition was studied as a function 
of ‘pure’ positive and negative 
reinforcements. Subjects were given 
but a single training trial and were 
reinforced by the single utterance of 
the word Right or Wrong. 


METHOD 


Subjects.—In order to enhance the comparability of this
experiment to those cited above, two different S samples were
employed: the first a psychiatric group similar to that used by
Buss, Braden, Orgel, and Buss (1956), the other an undergraduate
college group similar to those used by Buchwald (1959a, 1959b).
The psychiatric group consisted of 26 inpatients at a VA
hospital, who served on a volunteer basis. Most of them were
under 30 years, and had received 12 or more years of education.
Patients were excluded who had received shock treatment within
the preceding 3 weeks, who were disoriented, or who had organic
diagnoses. The college group consisted of 34 undergraduate
volunteers from the University of Pittsburgh.

Learning materials.—The task set for the Ss was to learn the
association of the nonsense syllable DAK with the number concept
“four” in the Wisconsin Card Sorting Test (Grant, 1951). The
WCST materials have been employed previously in the study by
Buss and Buss (1956). In the present experiment, 64 cards were
used, with each color, form, and number of forms presented
equally. A special training card was prepared which was of the
same overall size and shape as the WCST cards. On it were drawn,
in black ink on a white ground, four identical forms not
represented anywhere in the WCST. These forms could be described
approximately as vertical rectangles with a crescent-shaped
notch in the upper end. There are no black and white figures in
the WCST; thus, the only formal element shared by the training
card and the WCST series was that of “fourness.”

Procedure.—The Ss were tested individually. They were seated
across a desk from E, who informed them that this was “an
experiment to see how well people can solve problems when they
have very little information to go on.” They were told they
would be shown a series of cards, some of which are called DAK,
the others VEC, and that they should try to sort them into the
appropriate categories. Before being handed the deck of 64
cards, Ss were shown the training card and asked to guess its
identity. If the S said DAK, E replied “Right”; if S said VEC, E
replied “Wrong.” The deck of cards was then sorted without
further communication between S and E.


RESULTS 


Learning of the correct solution 
that DAK equals the number concept 
“‘four’’—would be reflected in a gen- 
eralization gradient 
creasing frequency of DAK responses 
over the categories 4, 3, 2, and 1 
of the WCST. Failure to learn the 
correct solution would be indicated 
by flat or increasing gradients. An 
analysis of variance comparing the 
gradients associated with the two 
S groups and two reinforcers yielded 
no significant differences. The S 
variability was very great, as was 
expected in light of the unstructured, 
ambiguous quality of a task in which 
performance has been guided by a 
single reinforcement. A nonvariance 
type of analysis was then selected, 
in which each S’s performance was 
evaluated grossly for overall evidence 
of learning, regardless of the slope of 
the gradient. This analysis, utilizing
a different type of score, is regarded
as essentially independent of the
unsatisfactory previous analysis. Table
1 shows the number of Ss whose
gradients were in the downward or
“positive” direction indicative of
correct learning, regardless of the
magnitude of the trend, and the number
whose gradients were zero or
“negative.” Thus, Ss were categorized as
having positive gradients whenever
the frequency of DAK responses to the
pooled four and three category
exceeded that to the pooled two and one
category. Fisher’s exact probability
test (Edwards, 1950) was used to
assess the significance of the
differential effectiveness of Right and Wrong
in the two groups. The exact
probability tests indicated: (a) in the
college group, a significantly greater
proportion of those receiving Wrong
had positive generalization gradients
than did those receiving Right
(P = .006); (b) in the psychiatric
group, the proportion of positive
gradients among those receiving
Right was not significantly greater
than among those receiving Wrong
(P = .208); (c) viewing the data
from another direction, the
psychiatric group has a significantly higher
proportion of positive gradients under
the Right condition than does the
college group (P = .006), and (d)
the proportion of positive gradients
under the Wrong condition is not
significantly higher in the college
group than in the psychiatric group
(P = .208). All probability values
are given for two-tailed tests.


TABLE 1

NUMBER OF Ss SHOWING POSITIVE GRADIENTS (CONSISTENT WITH LEARNING THE
CORRECT RESPONSE) AND ZERO OR NEGATIVE GRADIENTS, AFTER
REINFORCEMENTS RIGHT AND WRONG

          Psychiatric Group (N = 26)      College Group (N = 34)
Reinf.    Positive    Zero or Negative    Positive    Zero or Negative
          Gradient    Gradient            Gradient    Gradient
Right                                        2           17
Wrong                                        9            6
Total                                       11           23
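The probability reported for the college group can be checked against the counts that remain legible in Table 1. The sketch below is an illustration under stated assumptions, not the author's computation: it assumes that the legible rows (Right: 2 positive, 17 zero or negative; Wrong: 9 positive, 6 zero or negative) belong to the college group, since they sum to N = 34, and that the two-tailed value was obtained by doubling the one-tailed exact probability. The function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(a, row1, col1, n):
    """Probability that the top-left cell equals a when all margins are fixed."""
    row2 = n - row1
    return comb(row1, a) * comb(row2, col1 - a) / comb(n, col1)


def fisher_one_tailed(a, b, c, d):
    """Exact probability of a top-left count of a or fewer in the table
    [[a, b], [c, d]], with row and column totals held fixed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    lowest = max(0, col1 - (c + d))  # smallest count the margins allow
    return sum(hypergeom_pmf(k, row1, col1, n) for k in range(lowest, a + 1))


# Assumed reading of Table 1 (college group):
# Right: 2 positive, 17 zero or negative; Wrong: 9 positive, 6 zero or negative.
p_one = fisher_one_tailed(2, 17, 9, 6)
p_two = 2 * p_one  # doubling is an assumed two-tailed convention

print(f"one-tailed P = {p_one:.4f}, doubled P = {p_two:.4f}")
```

Under these assumptions the doubled probability comes to roughly .006, in line with the value given in (a).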


DISCUSSION 


For the group of college Ss, the results 
support the hypothesis that Right is a 
weaker reinforcer than Wrong. The 
group of Ss receiving Wrong showed a 
proportion of “positive” gradients, i.e.,
those consistent with learning of the 
correct response, that was significantly 
greater than that of the Ss receiving 
Right. Upon inspection of the absolute 
gradients of those receiving Right, it 
was apparent that, as a group, their 
gradients were not simply relatively 
flat, but that instead they moved in the 
upward or “negative” direction in a
trend of approximately equal magnitude 
to the positive gradients of those receiv- 
ing Wrong. This suggests that some 
learning had occurred, but of an incorrect 
concept in which DAK was associated 
with the numbers two and one. 

Informal discussion with Ss concerning 
their rationale for sorting of the WCST 
cards led to the following tentative 
explanation of the differential effective- 
ness of Right and Wrong for the college 
group. When an S responds to the single 
training card, he probably does so in 
accordance with an “hypothesis” or 
ideational response of some sort con- 
cerning the aspect of the stimulus card 
which denotes DAK-ness or VEC-ness. 
This initial hypothesis is almost certain 
to be incorrect because of the ambiguity 
of the task. If S says VEC, E says
Wrong, thereby punishing not only 
the utterance of the incorrect syllable, 
but also the incorrect ideational response 
associated with it. The S thus turns 
to the series of 64 cards with his dominant 
and incorrect hypothesis having been 
punished, and with the correct hypothesis
now somewhat closer to a position of
dominance in his hierarchy of ideational
responses. If, on the other hand, S
happens to say DAK to the training card, 
E says Right and thereby reinforces 
not only the utterance of that syllable 
but also the incorrect hypothesis which 
preceded it. Thus, such an S begins to 
sort the cards with his initially dominant 
and incorrect hypothesis made still more 
dominant by E’s reinforcement. Consequently,
he would be expected to be less
“flexible” in his sorting, and more likely
to perpetuate the type of error associated 
with his first hypothesis. 

In general, it is suggested that Wrong 
will have a greater positive effect on 
learning than will Right when the 
dominant hypotheses concerning the dis- 
criminative stimuli are incorrect. An 
answer to the question of the differential 
effectiveness of positive and negative 
reinforcers per se would then await 
research in which the relative initial 
dominance of correct and of incorrect 
hypotheses is equated. 

For the psychiatric group, the Buss, 
Braden, Orgel, and Buss (1956) hypothesis 
that Wrong is a stronger reinforcer than 
Right was not supported; as noted earlier 
the trend was in the opposite direction. 
The interpretation of this negative
finding does not seem immediately clear. 


SUMMARY


A direct test was attempted of the hypothe- 
sis that the word Wrong is a stronger rein- 
forcer than the word Right. Twenty-six 
psychiatric inpatients and 34 college under- 
graduates learned the association of a non- 
sense syllable with the number concept 
“four.” After a single training trial in which
S’s response was followed by E saying either
“Right” or “Wrong,” Ss sorted 64 cards of
the Wisconsin Card Sorting Test without 
further guidance from E. The data consisted 
of the number of Ss whose generalization 
gradients showed decreasing frequencies of 
response from the pooled number category 
4-3 to that of 2-1. In the college group such 
“positive” gradients, indicative of learning
of the correct response, were significantly 
more often associated with the reinforcer 
Wrong than with Right. Such a significant 
difference did not appear in the data for the 
psychiatric group. A comparison between S
groups showed that Right was significantly
more often associated with positive gradients 
in the psychiatric group than in the college 
group.


PATTERN MATCHING IN THE PRESENCE OF VISUAL NOISE 1


MALCOLM D. ARNOULT 


Texas Christian University 


During the past few years considerable
research has been devoted
to specifying the conditions affecting
discrimination, recognition, and
identification of shapes and patterns. Of
particular interest is the situation
in which a stimulus is to be matched
with one which differs somewhat from
it in appearance, because this situation
could be considered a prototype
of everyday perception. Completely
identical objects are rarely seen,
yet observers carry out fairly efficiently
the processes of categorization
and recognition despite the many
differences which exist between objects
belonging to the same class.
Research is needed to identify the
various ways in which visual stimuli
may be distorted and the effects of
distortion in relation to the kind of
stimulus, the kind of perceptual task,
and the conditions of observation.

Distortion has been used as an independent 
variable in experiments with both shapes and 
patterns, but the results obtained so far have 
not been easy to generalize from one experi- 
ment to another. Consequently, only the 
results obtained with distorted patterns will be 
discussed in relation to the present experiment. 

French (1954) used patterns composed of 
small numbers of dots and introduced visual 
noise (distortion) by adding dots randomly 
to the pattern. He found that errors de- 
creased as the number of dots in the basic 
pattern increased (up to nine) and increased 
as the number of noise dots increased (up to 
eight). Pollack (1955) investigated the effect
of visual noise on the recognition of patterns
constructed by the sequential arrangement
of stimulus elements into “paths.” In one
of his conditions S was required to match a
distorted path to one of a set of alternative
paths. When the number of alternatives was
four, for example, he found that a 50%
distortion (noise level) of the stimulus path
produced 20% errors in matching, corresponding
to an increase of response uncertainty of
slightly less than 1 bit.


1 This report is based in part on a thesis
submitted by the author to the
Graduate School of the University of
Mississippi in partial fulfillment of the
requirements for the degree of Master of
Arts in January 1960.

Recently Hillix (1960) has reported an 
experiment in which the stimuli consisted of 
patterns of filled and unfilled cells of a 100
X 100 matrix. He varied the number of
filled cells from 10% to 50% and the amount
of distortion from 10% to 40%. He found
that the number of errors made in matching
distorted patterns to their prototypes could
be predicted (r² = .88) from a similarity index
based on the number of filled cells common to
the two stimuli. Likewise, response latency
could be predicted on the same basis (r² = .94).


The present experiment was, in
general, very similar to the one by
Hillix, although the specific conditions
all differed. Cells were filled with
dots rather than squares, the number
of filled cells was determined on a
probability basis (P = .50), distortion
was introduced by adding or removing
filled cells rather than by displacing
them, and the conditions of stimulus
presentation differed in a number of
details. Nevertheless, there was
considerable similarity between the tasks,
and it is of interest to determine
whether the data can be described by
the same empirical function.

METHOD 


Stimuli.—A number of prototype patterns 
was constructed, and for each of these a 
number of test patterns was constructed by 
introducing a specified amount of randomly- 
determined variation into the prototype pattern.
The prototypes were constructed on
8 X 8 matrices, with dots being placed in the
64 cells with a probability of .50. The actual
number of dots on the completed prototypes
varied from 24 to 36, with a median of 29.5.
Four prototypes were assigned to each of
four conditions of visual noise, and 7 test 
patterns constructed for each prototype, 
making a total of 112 test patterns in the 
experiment. 

Test patterns were constructed by dis- 
torting the prototypes in the following way: 
a randomly selected cell of the matrix was 
changed to the opposite state; i.e., if it was 
vacant, a dot was added, and if it had a dot, 
the dot was deleted. The number of cells 
chosen for alteration was determined by the 
noise level desired. In the 0% noise condition 
no cells were changed; each test pattern was 
identical to its prototype. The 5% noise 
level was approximated by changing 3 of the 
64 cells. Similarly, 10% noise involved
changing 6 cells, and 20% noise involved
changing 13 cells. Thus, each test pattern
at a given noise level was equivalent to each 
other test pattern at that level but differed 
in terms of the particular cells which had 
been changed. 
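The construction of prototypes and test patterns described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the sketch assumes that the altered cells were drawn without replacement, so that exactly 0, 3, 6, or 13 distinct cells change state.

```python
import random

SIZE = 8  # 8 X 8 matrix, 64 cells
# Noise level (%) -> number of cells altered, as given in the text.
CELLS_CHANGED = {0: 0, 5: 3, 10: 6, 20: 13}


def make_prototype(rng):
    """Place a dot (1) in each cell with probability .50."""
    return [[1 if rng.random() < 0.5 else 0 for _ in range(SIZE)]
            for _ in range(SIZE)]


def distort(prototype, noise_pct, rng):
    """Return a test pattern with randomly chosen cells switched to the
    opposite state: a vacant cell gains a dot, a filled cell loses its dot."""
    test = [row[:] for row in prototype]
    cells = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    # Assumption: distinct cells are chosen (sampling without replacement).
    for r, c in rng.sample(cells, CELLS_CHANGED[noise_pct]):
        test[r][c] = 1 - test[r][c]
    return test


def cells_differing(a, b):
    """Count the cells in which two patterns differ."""
    return sum(1 for r in range(SIZE) for c in range(SIZE) if a[r][c] != b[r][c])


rng = random.Random(0)
prototype = make_prototype(rng)
# Seven test patterns per prototype, here at the 20% noise level.
test_patterns = [distort(prototype, 20, rng) for _ in range(7)]
print([cells_differing(prototype, t) for t in test_patterns])
```

Every test pattern at a given level differs from its prototype in the same number of cells, but in different cells, which is the sense in which the patterns at one level are equivalent.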

The test patterns were made by photo- 
graphing a matrix of illuminated circles. The 
resulting negatives were bound into 2 X 2 in. 
slides for use in the projector. The prototype 
patterns were cut out of heavy poster board 
and backed by black paper. The dot pattern 
was centered on a 10 X 10 in. field, which 
was the size of the field on which the test 
patterns would be projected. Figure 1 is an 
example of the test situation as viewed by S. 

Apparatus.—The test patterns were pro- 
jected by an ordinary slide projector equipped 
with an aluminum leaf-type shutter controlled
by a rotary solenoid actuated by a Hunter 
timer. Exposure time was 0.7 sec.; inter- 
exposure time was controlled by E and was 
about 5 sec. in duration. 

The projection screen was a 37 X 38 in. 
Masonite board painted flat black on S's 
side. The four prototype patterns for a given 
noise level were attached at the corners of the 
screen. In the center was a 10 X 10 in. 
translucent screen on which the test patterns 
were projected from the rear. The room was 
brightly illuminated by overhead fluorescent 
fixtures, and the contrast in the projected 
pattern was approximately equal to that of 
the prototype patterns. 

FIG. 1. The matching task as viewed by S.
(The test pattern appeared in the center for
0.7 sec. and was to be matched with one of
the four surrounding prototypes. The pattern
illustrated is a 20% distortion of the lower
left prototype.)


Procedure.—The Ss were seated at an
average distance of 11.5 ft. from the screen
and provided with answer sheets. For each
trial the answer sheet provided four boxes
corresponding to the positions of the four
prototype patterns. The following
instructions were read:


The purpose of this experiment is to
investigate how accurately and rapidly
people can identify complex patterns under
various conditions. All of the patterns
you will see will be of the same general type
as the four which you now see in front of
you. I will flash a pattern on this screen in
the center; your task will be to determine 
which of these other four patterns it most 
resembles. In some cases the resemblance 
will be quite obvious. In other cases your 
decision will be more difficult. But in every 
case the pattern which is flashed on the 
screen will resemble one of these four 
“prototypes” more than it resembles the 
other three. Is that clear? 

The test pattern will remain on the 
screen only a short time, so you must pay 
close attention if you are to identify it 
correctly. As soon as the pattern disap- 
pears, mark your decision on your answer 
sheet in one of the four boxes provided for 
each trial. If the test pattern most resem- 
bles the upper-left prototype, put an “X” 
in the upper-left box; if it most resembles 
the lower-right prototype, put your “X” 
in the lower-right box, and so forth. Is 
that clear? 

Just before each pattern is flashed on 
the screen I will give you a “Ready” 
signal by calling out the number of the
trial. For example, I will say “This is
Number 1,” “This is Number 2,” and so
forth. 

Keep your hand near the paper so that 
unnecessary searching for the proper trial 
will not interfere with your responses. Are 
there any questions before we begin? 


All 28 patterns for a given noise level were 
presented at the rate of about one every 6 
sec. (0.7-sec. exposure + 5-sec. intertrial inter- 
val), following which there was a short rest 
period during which the prototype patterns 
were changed to those associated with a 
different level of visual noise. The order in 
which the noise levels were tested and the 
positions of the prototype patterns on the 
screen were counterbalanced. A new random 
order of the 28 test patterns within a noise 
level was used with every group of Ss. 

Subjects.—The Ss were 80 college students 
about equally divided between men and 
women. They were unsystematically divided 
into 10 groups of 8 Ss each, and data were 
collected from all members of a group simul- 
taneously. 


RESULTS AND DISCUSSION 


FIG. 2. Percentage errors and information
transmitted as a function of the percentage
distortion (noise level) of the prototype
patterns.


The data were analyzed both in
terms of the percentage of errors at
each noise level and in terms of
amount of information transmitted
as a function of amount of visual noise,
or distortion (Attneave, 1959). Figure
2 shows that both measures
yielded functions which do not depart
significantly from linearity over the
range of noise levels used. At the
highest noise level Ss were making
slightly over 50% errors and were
transmitting only 0.20 bits of information
of the 2.00 bits inherent in the
four response categories.
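Information transmitted is commonly estimated from a stimulus-response confusion matrix as T = H(stimulus) + H(response) − H(stimulus, response); this is presumably the measure intended by the citation of Attneave (1959). The sketch below assumes that formula. The function names and both confusion matrices are hypothetical and are not data from the experiment.

```python
from math import log2


def entropy(counts):
    """Uncertainty in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)


def information_transmitted(confusion):
    """confusion[i][j] = number of times prototype i was answered as j.
    Returns T = H(stimulus) + H(response) - H(stimulus, response) in bits."""
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    joint = [n for row in confusion for n in row]
    return entropy(row_totals) + entropy(col_totals) - entropy(joint)


# Hypothetical errorless matching: 4 prototypes, 7 test patterns each.
perfect = [[7, 0, 0, 0],
           [0, 7, 0, 0],
           [0, 0, 7, 0],
           [0, 0, 0, 7]]

# Hypothetical matching with many confusions.
noisy = [[4, 1, 1, 1],
         [1, 4, 1, 1],
         [2, 1, 3, 1],
         [1, 2, 1, 3]]

print(information_transmitted(perfect))  # the 2.00-bit ceiling for four categories
print(information_transmitted(noisy))
```

With four equally frequent prototypes the ceiling is log2(4) = 2.00 bits, the figure against which the 0.20 bits at the highest noise level is compared.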


It is interesting to compare these 
curves with the data reported by Pollack 
(1955) for distorted dot patterns arranged
to form “paths.” He found
errors to be a positively accelerated func- 
tion of amount of visual noise, with little 
or no increase in errors below a noise 
level of 20%. Although the methods of 
producing noise are not directly com- 
parable, it is notable that there are only 
slight signs of positive acceleration in 
the error function obtained in the present 
experiment. It is also notable that there 
were 10% errors committed at the 0% 
noise level in this task and only about 
1% in Pollack’s task, suggesting that 
the present task was considerably more 
difficult. The number of pattern ele- 
ments was roughly the same in the two 
studies (28 vs. a median of 29.5), and 
the total number of matrix cells was 
much greater in Pollack’s experiment 
(324 vs. 64), but the constraint he im- 
posed in requiring the elements to form 
a path apparently reduced the total 
amount of information in the patterns 
to a value less than in the dot patterns 
used here. 

The present task was also more dif- 
ficult than that used by Hillix (1960). 
His 50% Fill- 10% Distortion condition 
and 50% Fill- 20% Distortion condition 
are comparable, respectively, to the 10% 
and 20% distortion conditions we used. 
His Ss committed about 25% errors in 
both conditions whereas our Ss com- 
mitted 27% and 50% errors. The dif- 
ference is probably attributable to three 
factors: (a) Hillix’ Ss had to choose 
from among three alternatives, whereas 
our Ss had four; (b) different methods
were used to produce distortion; and (c) 
his stimuli were exposed until S re- 
sponded (about 9 sec.), whereas our 
stimuli were exposed for only 0.7 sec. 
Despite the differences in overall level
of performance in the two experiments,
S’s responses may have been determined
by the same sort of factors in both
cases. Hillix was able to show that he
could obtain an r² of .88 between the
number of correct responses and a meas- 
ure of “similarity’’ based, essentially, 
on the number of filled cells common to 
the standard and the correct alternative. 
Consequently, an attempt was made to 
apply a similar measure of similarity 
to the present stimuli. It was not 
possible to use exactly the same index 
used by Hillix because the number of 
filled cells in common with the standard 
was not necessarily the same for each 
of the seven alternatives constructed. 
Instead, the proportion of filled cells in 
common was computed for each alterna- 
tive and averaged across all seven. 
This procedure resulted in 16 values for 
the similarity index, one for each of the 
four prototypes (or standards) at each 
of the four noise levels. Correlating the 
values of this index with the number of 
correct choices for each prototype at 
each noise level yielded an r² = .94,
with the equation for the line of best
fit being C′ = 1.977I − 1.070, where
C′ is the predicted number correct and I
is the index of similarity.
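The index and the fitted line can be illustrated as follows. This is a sketch under stated assumptions, not the authors' procedure: it reads the printed equation as C′ = 1.977I − 1.070, it takes the "proportion of filled cells in common" to mean shared filled cells divided by the number of filled cells in the prototype, and all function names and the small patterns are hypothetical.

```python
def filled_cells(pattern):
    """Set of (row, column) positions that contain a dot."""
    return {(r, c)
            for r, row in enumerate(pattern)
            for c, value in enumerate(row) if value}


def similarity_index(prototype, test_patterns):
    """Mean proportion of the prototype's filled cells that are also filled
    in each test pattern (the denominator is an assumption)."""
    proto = filled_cells(prototype)
    proportions = [len(proto & filled_cells(t)) / len(proto)
                   for t in test_patterns]
    return sum(proportions) / len(proportions)


def predicted_correct(index):
    """Line of best fit as read from the text: C' = 1.977 I - 1.070."""
    return 1.977 * index - 1.070


# Tiny hypothetical patterns, only to show the calculation.
prototype = [[1, 0, 1],
             [0, 1, 0],
             [1, 0, 1]]
test_a = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 0]]  # one dot removed
test_b = [[1, 1, 1],
          [0, 1, 0],
          [1, 0, 1]]  # one dot added

index = similarity_index(prototype, [test_a, test_b])
print(index, predicted_correct(index))
```

On this reading an index of 1.0 (test patterns identical to the prototype) predicts C′ = .907, which agrees with the roughly 10% errors observed at the 0% noise level and suggests that C′ behaves as a proportion correct.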

Thus, for both Hillix’ experiment and 
the present one, it is possible to make 
quite accurate predictions of the number 
of correct responses as a function of the 
amount of distortion introduced into the 
pattern. To be most useful, however, 
the index of similarity should predict 
not only that an error will be made but 
which incorrect alternative will be chosen. 
Every possible pair of patterns should 
have some probability of being chosen 
as a consequence of the number of filled 
cells in common. At each level of dis- 
tortion in the present experiment there 
were 84 possible pairs composed of test 
patterns and incorrect prototypes. The 
index of similarity for each pair was 
correlated with the number of times each 
pair was chosen in the experiment. The 
correlations for the various noise levels 
fell in the range from .25 to .35. These 
were statistically significant, but were 
not large enough to provide a basis for 
accurate predictions. It may be that
precise prediction of pairings of these
patterns depends upon complex sub- 
patterns of dots as well as upon the loca- 
tions of individual dots. For example, 
it would be interesting to get indexes 
of similarity based upon pairs of dots in 
common, triplets in common, quad- 
ruplets in common, and so forth. The 
similarity of subpatterns composed of 
white spaces would also possibly be 
important. The kind of analysis being 
proposed would be practicable only if it 
could be automatized and carried out 
by a large computer. If it paid off, 
though, it would provide a possible basis 
for predicting errors of matching made 
with almost any kind of complex pat- 
terns, in that the patterns could be 
quantized as matrices of filled and un- 
filled cells, and the index of similarity 
used to predict the probability of match- 
ing errors made by a human observer. 

In analyzing the incorrect responses 
it was observed that these errors ap- 
peared to be nonrandomly distributed 
among the prototypes and that this 
tendency toward nonrandomness ap- 
peared to increase with increasing dis- 
tortion of the test patterns. This 
observation was tested by computing 
the value of χ² for the distribution of
errors among incorrect alternatives for
each prototype at each level of distortion.
The probabilities associated with each of
the resulting 16 χ²’s are shown in Table 1.
It can be seen that in 8 of the 16 cases χ²
was significant beyond the .05 level, and
that both the number of significant χ²’s
and their level of significance increased
with an increase in the amount of distortion.
It would appear, then, that one
of the effects of increasing distortion was
to cause the test patterns to become more
similar to some of the incorrect
prototypes than to others.


TABLE 1

PROBABILITIES ASSOCIATED WITH χ² TESTS
OF THE HYPOTHESIS THAT ERRORS ARE
DISTRIBUTED RANDOMLY AMONG
THE INCORRECT ALTERNATIVES

(Rows: Prototype; columns: Noise Level.)

* P > .05

SUMMARY 


Prototype patterns of dots were con- 
structed by placing dots with a probability 
of .50 in the cells of an 8 X 8 matrix. Dis- 
torted versions of these prototype patterns 
were then constructed by randomly removing 
or adding dots in 0%, 5%, 10%, or 20% of 
the cells in the matrix. Four prototype 
patterns were used at each noise level, and 
seven randomly-differing test patterns were 
constructed for each prototype. Each test 
pattern was presented for 0.7 sec., and the task
of the Ss (N = 80 college students) was to match
it to the prototype which it most resembled. 

The major findings were: 

1. Errors increased from 10% at 0%
distortion to 50% at the 20% distortion level.
At the same time, information transmitted
decreased from 1.38 bits to 0.20 bits. Both
relationships were linear.

2. The increase in errors could be
accounted for (r² = .94) by an index of
similarity based on the number of dots in common
between the test patterns and the prototype
at each noise level. The index did not,
however, account satisfactorily for the
distribution of errors among the incorrect alternatives.

3. The distribution of errors among
incorrect prototypes was increasingly nonrandom
as distortion increased.


DEPRIVATION AND REINFORCEMENT 1

University of Missouri


On the basis of a series of studies 
(Collier & Myers, 1961; Collier & 
Siskel, 1959), two clusters of events 
have been isolated which control 
momentary rate of responding. The 
first cluster is the olfactory, gusta- 
tory, kinaesthetic, and other sensory 
consequences of the reinforcing sub- 
stance. The basic relation here is 
that rate is a linear increasing func- 
tion of log intensity of stimulation. 
The second cluster is the momentary 
postingestive load. Rate of responding
is a decreasing function of concentration.
The present study is concerned
with the effect on these functions
of the nutritive state of the
animal.


METHOD 


Subjects.—The 172 Ss were naive male 
rats, 90 to 120 days old, of the Sprague- 
Dawley strain obtained from the Holtzman 
Company. Twenty-eight Ss were lost because 
they either failed to drink during magazine 
training or failed to make more than 10 
responses per bar press (BP) session; 23 of 
the losses occurred in the low-deprivation 
condition and 12 of these were lost in the 4% 
condition. This differential loss biases the 
sample. 

Apparatus.—Four Skinner boxes that 
provided liquid reinforcement were used. 
The magazines were 12-in. diameter, covered, 
aluminum plates with 72 .4-ml. cups around 
the periphery. The magazines were loaded 
by means of a Cornwall automatic pipetting 
device. A more detailed description of the 
apparatus is given in Collier and Myers 
(1961). The solutions used were prepared
from commercial sugar 24 hr. before use and
kept at room temperature no longer than
48 hr.


1 This investigation was supported by
Research Grant M-3328 from the National
Institute of Mental Health, of the National
Institutes of Health, Bethesda, Maryland.
A preliminary report of these data was given
in a paper entitled “Regulation and
reinforcement,” Midwestern Psychological
Association, 1960, St. Louis, Missouri.


Procedure.—Three concentrations of
sucrose (4%, 16%, and 64%), two volumes of
reinforcement (.1 and .3 ml.), two
fixed-interval interreinforcement intervals (1 and 4
min.), and two levels of deprivation (3 hr.
[L] and 23 hr. [H]) were combined
factorially. Six Ss were in each cell. Three
hours was chosen as the low deprivation
condition to minimize gastric load.
Experimental sessions were 30 min. in duration. Seven
days of table training involving handling,
deprivation adaptation, and drinking of the
appropriate test solutions, were followed by
4 days of magazine training, 7 days of bar
press training, and 3 days of extinction.
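The size of the factorial design follows directly from the levels listed above. The short sketch below enumerates the cells; the variable names are hypothetical, and the only figures used are those stated in the Method.

```python
from itertools import product

concentrations_pct = (4, 16, 64)
volumes_ml = (0.1, 0.3)
intervals_min = (1, 4)
deprivations_hr = (3, 23)
SS_PER_CELL = 6

# Every combination of concentration, volume, interval, and deprivation.
cells = list(product(concentrations_pct, volumes_ml,
                     intervals_min, deprivations_hr))

n_cells = len(cells)
n_ss_needed = n_cells * SS_PER_CELL

print(n_cells, n_ss_needed)   # 24 cells, 144 Ss
print(172 - 28)               # Ss started minus Ss lost
```

The 144 Ss required to fill 24 cells of 6 correspond to the 172 Ss started less the 28 reported lost, and 24 is also the size of each replication in which the experiment was run.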

On the 3-hr. deprivation schedule, the food 
was removed 3 hr. before the experimental 
session and replaced immediately following 
the session. On the 23-hr. schedule, Ss were 
fed for 1 hr. immediately following the experi- 
mental session. The differential amounts of 
sugar ingested under each of the Concentra- 
tion X Volume X Interval combinations were 
not compensated for in the daily diet. 
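The size of these differences in sugar intake can be illustrated from figures given in the note to Table 4: the gm/ml equivalents of the three concentrations (.040, .170, and .838) and the number of reinforcements possible per session (30 on the 1-min. schedule, 8 on the 4-min. schedule). The sketch assumes that grams consumed equal reinforcements taken times volume times gm/ml; the function and variable names are hypothetical.

```python
# gm of sucrose per ml of solution, from the note to Table 4.
GM_PER_ML = {4: 0.040, 16: 0.170, 64: 0.838}
# Reinforcements possible in a 30-min. session, from the same note.
MAX_REINFORCEMENTS = {1: 30, 4: 8}


def sugar_consumed(n_reinforcements, volume_ml, concentration_pct):
    """Assumed rule: grams = reinforcements taken x ml each x gm per ml."""
    return n_reinforcements * volume_ml * GM_PER_ML[concentration_pct]


# Largest possible intake per session in each Concentration X Volume X Interval cell.
for conc in (4, 16, 64):
    for volume in (0.1, 0.3):
        for interval in (1, 4):
            grams = sugar_consumed(MAX_REINFORCEMENTS[interval], volume, conc)
            print(f"{conc:>2}%  {volume} ml  {interval}-min.  {grams:5.2f} gm")
```

The assumed rule appears to reproduce legible Table 4 entries: 29.9 reinforcements of .1 ml at 64% give about 2.51 gm, and 16.3 reinforcements of .3 ml at 64% give about 4.10 gm.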

Magazine training consisted of 30 presen- 
tations per session of the appropriate volume 
and concentration on a 1-min. variable interval
(VI) schedule. The number of presentations
consumed was recorded and any S
which failed to take at least 25 of the 30 
possible on each of the last 2 days of maga- 
zine training was discarded. 

During extinction sessions, the magazine 
was present but did not operate. 

The experiment was run in replications 
of 24 Ss. With the exception of volume, 
conditions were counterbalanced across cages, 
boxes, and time of running. The Ss that 
received .3 ml. reinforcement were started 
when about one-half of those with .1 ml. 
reinforcement were completed. 


RESULTS 
Figure 1 presents the cumulative
BP curves showing the within-session
course of responding. Figure 2
presents the number of responses in the
first 5 min. An analysis of the average
of the last 2 days (6 and 7) of
reinforced BP for concentration, volume,
interreinforcement interval, deprivation,
and minute of the session is
presented in Table 1. Figure 3
presents the parallel data for Day 1
of extinction. An analysis of these
data for C, V, I, D, and extinction
sessions, is summarized in Table 2.
Initial and terminal rate will be
examined separately.


FIG. 1. Cumulative number of BP as a function of concentration (4%, 16%, 64%), volume
(.1 and .3 ml.), interval (1 and 4 min.), and deprivation (L and H).


FIG. 2. Total number of BP in first 5 min.
as a function of concentration (4%, 16%,
64%), volume (.1 and .3 ml.), interval (1 and
4 min.), and deprivation (L and H).

Initial rate.—Examination of the
initial slopes of the curves in Fig. 1 
shows that initial rate is an increasing 
function of concentration. Compari- 
son of Fig. 3 with Fig. 2 further shows 
that the linear relation between rate 
and log concentration is obscured by 
the decremental effects of even a few 
reinforcements and their consumma- 
tory and postingestive consequents. 

Initial rate is an increasing function 
of volume as shown in Fig. 2. A 
similar relation is shown in the extinc- 
tion data of Fig. 3, where the consum- 
matory and postingestive effects are 
absent. Volume and concentration 
interact (Fig. 2 and 4) so that the 
slope of the initial rate-concentration 
function is greater at higher volumes. 
Figure 4 presents the volume-concentration
extinction data collapsed over
interval. In the present experiment
at the lowest concentration the
volumes used do not produce very
different initial rates. Other data
obtained in this laboratory indicate that
this difference at the near threshold
concentration is a function of the
kind (e.g., thirst or hunger) and
degree of deprivation.


TABLE 1

ANALYSIS OF VARIANCE OF THE [illegible] DAYS 6 AND 7

Initial rate is a decreasing function 
of interreinforcement interval. This 
relation is preserved in the number of 
extinction responses of Day 1 of 
extinction, but on Days 2 and 3 the 
4-min. interval groups show greater 
resistance to extinction (Table 3). 
While the interactions involved do not
quite reach significance, it appears
that the relation (Fig. 3 and Table 3)
between interval and number of
extinction responses in the first session
is a complex one. The superiority
in number of responses appears to
hold only at small volumes and to be
increased by deprivation (Fig. 3).
The slope of the initial rate-concentration
function is a decreasing function
of interval (Fig. 1). This relation
is very rapidly contaminated by
consummatory and postingestive
effects. As can be seen in Fig. 3, the
greater resistance to extinction of the
long-interval groups at large volumes
confounds this relation in extinction,
with the large-volume long-interval
groups showing the greatest resistance
to extinction. Neither the present
study nor the Collier and Myers
(1961) study yields any evidence of
an interaction of volume and interval
for initial rate, although there is a
substantial one over the course of a
session as will be noted later.

Initial rate is an increasing function 
of deprivation. Of greater novelty 
and more interest are its interactions. 


Fic. 3. Total number of BP in first extine- 
tion session as a function of 
(4%, 16%, 64%), volume (.1 
interval (1 and 4 min.), 


(L and H). 


concentration 
and .3 ml 


and deprivation 
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rABLE 2 


ANALYSIS OF VARIANCE OF TOTAL NUMBER 
or BP IN THE THREE EXTINCTION 
SESSIONS 


Source { MS 
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increasing function of deprivation 
(Fig. 1, 2, 3, and 4). The course of 
this relation over extinction sessions 
is shown in Table 3. Table 3 further 
shows that the rate-volume functions 
sanaaain for high and low deprivation are 
** P= ‘Ol. essentially parallel.2 The effect of 
increasing deprivation on the rate- 
The interactions between concentra- interval relation is to increase the 
tion and deprivation and interval and_ slope of the rate-interval function. 
deprivation (for initial rate) reach Aa I Ect ig chasis |S IP eg a 
significance while that for volume and laboratory in “4 sonent study ph ae gs 
deprivation does not. ; The slope Ol wider range of volumes and using water for 
the rate-concentration function is an _ reinforcement. 
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TABLE 3 


TotaL NUMBER OF BP IN EXTINCTION AS A FUNCTION OF INTERVAL, 
CONCENTRATION, AND VOLUME 





| Interval 
Extinction | Deprivation 
Session | 


1Min. | 4Min. | 4% 








119.9 | 1026 | 49.3 
54.9 38.9 
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Fic. 5. Number of BP in each 5 min. of reinforced BP as a function of concentration (4%, 
16%, 64%), volume (.1 and .3 ml.), and deprivation (L and H) for 1-min. FI. 
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Number of BP in each 5 min. of reinforced BP for concentration (4%, 16%, 64%), 
volume (.1 and .3 ml.), and deprivation (L and H) for 4-min. interval. 
TABLE 4

AVERAGE NUMBER OF REINFORCEMENTS TAKEN AND TOTAL AMOUNT OF SUGAR CONSUMED

                3-Hr. Deprivation      -Hr. Deprivation
                4%    16%    64%       4%    16%    64%
1 min.
4 min.
1 min.
4 min.

.13)  .50)  28.6(2.40)  29.9(2.51)  8.0( .67)  (.12  29.7 (  .65)  7  3  .4(.03)
29.0(.35)  7.1(.09)  16.3(4.10  29.9(1.52  7.1(1.78  1  8.0( .

Note.—Thirty reinforcements were possible on the 1-min. schedule; 8 on the 4-min.
schedule. 4% = .040 gm/ml; 16% = .170 gm/ml; 64% = .838 gm/ml; total grams
sugar (solute) consumed/session shown in parentheses.

There is little or no difference in rates between the 1-min. and 4-min.
intervals at low deprivation while the initial rate is highest for the 1-min.
interval at high deprivation. This relation is preserved only for the .1-ml.
volume in extinction. As previously mentioned, no significant differences in
rate of responding for the two intervals occur at the .3-ml. volume in
extinction.

Terminal rate.—Maximal within-session declines in rates of responding occur
under two conditions: combinations of low concentrations, small volumes, and
long intervals; and combinations of high concentrations, large volumes, and
short intervals (Fig. 1. See also Collier & Myers, 1961). The effect of
deprivation on these relations can best be seen from Fig. 5 and 6 which
present the percentage of total number of responses occurring in each 5-min.
interval for the last day of the 1-min. interval data and the 4-min. interval
data, respectively.

It can be seen by examining the shifting relation between rate of decline and
volume across concentrations, that for the 4% sucrose the largest volumes
(which still result in a small load of solute) result in the least
within-session declines (irrespective of deprivation); for 16% concentration
the largest volume, yielding three times the load of solute (Table 4), shows
a more rapid within-session decline (relatively independently of
deprivation); for 64% sucrose, the largest volume produces the greatest
within-session decline, and the low deprivation, the most rapid decline. It
is interesting to note that for this latter group, at .1 ml. per
reinforcement, the rate and amount of solute loaded is the same for both the
high and low deprivation conditions (Table 4). Thus, the relative ability of
a given load above a certain value to “shutoff” responding is less the
greater the hunger.

DISCUSSION 


In summary, the preceding data show that initial rate is a linear increasing
function of log concentration of sucrose and that the slope and intercept
(threshold rate of response) of this function are jointly determined by the
interval between reinforcements, the volume per reinforcement, and the level
of deprivation. Each of these variables produces a family of diverging
rate-concentration functions. The intercept of the initial rate-concentration
function is an increasing function of volume and deprivation, and a
decreasing function of interval. Both the BP data and the extinction data
(free of ingestive and postingestive effects) show that deprivation interacts
with concentration but not with volume. Instead, for volume, a series of
parallel rate-volume functions results when deprivation is varied. The effect
of deprivation on interval is to increase the difference in rates of
responding at different interreinforcement intervals. Declines from the
initial rate occur as functions of two classes of events, the number of
ingestive responses made and the colligative properties (osmotic pressure,
etc.) of the momentary gastric load. The decline in rate as a function of the
number of responses (i.e., minimal reinforcement conditions) is a decreasing
function of concentration and appears to be little affected by volume (amount
of ingestive activity) or deprivation while the decline in rate as a function
of momentary gastric load is a joint function of the amount of solute
consumed and the state of deprivation. Differential declines in rate as a
function of these variables are responsible for the nonmonotonicity of all
but the initial rate-concentration functions.

Variations in the character of a reinforcement are reflected in variations in
the performance of the behavior of which it is a consequent. Examination of
the temporal course of this relation has led to the view that there are at
least two loci for these rate governing variables, the proximal reinforcing
stimuli and the immediate postingestive concentration. The results of the
present study show a to-be-expected third locus of a rate governing event:
the nutritive state of the organism. Momentary rate of food reinforced
responding is an increasing function of the hours of food deprivation. Two
possible points in the ingestive chain where deprivation may exercise its
effects are examined in the present study: (a) the relation between the
sensory consequences and rate of responding, and (b) the relation between
rate of shutoff and momentary load.

Considering, first, the effects of deprivation on rate of shutoff, there are
two sets of variables to be considered, the ingestive responses and the
gastric load.
The occurrence of a decline in rate of responding under conditions of minimal
reinforcement indicates that occurrence of a response leads to a decrement in
response likelihood. The fact that the same level of responding is maintained
from session to session precludes an explanation in terms of extinction, and
the small gastric load involved precludes an account in terms of satiation.
It may perhaps be denoted as response habituation (Collier & Myers, 1961). As
would be expected, it is insensitive to deprivation. Rate of shutoff,
excluding the minimal reinforcement conditions, is a joint function of
concentration, volume consumed, and deprivation. The interaction of the first
two of these variables suggests that tonicity is the variable involved. That
volume itself is not sufficient to produce a decline is apparent from the 4%
and 16% groups. At high concentrations the effect of volume is greater. The
volume effect may simply reflect local dilution. This view is supported by
the contrast between the 16% and 64% groups.

Sixteen percent sucrose is close to tonicity (approximately 9% in the rat for
sucrose) and this is reflected in the fact that neither volume per
reinforcement nor interval between reinforcements, as reflected in the load
taken on in 20 min., is very effective in the 16% groups in producing
differential within-sessions declines in rate of responding. How deprivation
exercises its effect is not clear from the present study. Two alternatives
are that (a) deprivation modifies the proportionality relation between rate
of responding and the tonicity of the gastric load, or (b) that it does not
have an effect on this relation and the effect observed results from the fact
that at 3-hr. deprivation sufficient residual gastric load from previous
feedings is present to interact with the current load. The latter conclusion
is suggested by the results of the 16% group which again show little effect
as the result of different deprivations.

Of the three sensory parameters examined (volume, concentration, and
interval) deprivation appears to interact with only the latter two. The
slopes of the initial rate-concentration and initial rate-interval functions
are increasing functions of deprivation while the slope of the initial
rate-volume function is not.

The effects of deprivation and volume are additive, while those of
concentration and deprivation and interval and deprivation are
multiplicative. Thus it appears that deprivation exerts its influence on the
rate of ingestion via an effect on the relation between the intensity of the
reinforcing stimulus and the rate of performance of the reinforced response.
Volume does not change the intensity of stimulation, and therefore, changes
in deprivation should, on this view, result in a family of parallel
rate-volume curves. At small volumes, where dilution may play a major role,
volume and concentration interact in a fashion similar to the intensity-area
functions (Collier & Myers, 1961). An account of the interval-deprivation
interaction is not obvious. The times involved are too long for summation in
the ordinary sense. If it is assumed that adaptation is proportional to
intensity, then the effect observed could result from the greater recovery
from adaptation at the long intervals.

An attractive hypothesis has been that deprivation lowers the absolute
gustatory thresholds (e.g., Campbell & Sheffield, 1953; Hebb, 1955). The
diverging rate-concentration functions having a common origin obtained in the
present study do not support this conjecture. If rate of responding is
equated with sensitivity, the present data are consistent with the hypothesis
that differential gustatory thresholds are modified by deprivation; but it
seems more reasonable to assume that rate and sensitivity are not equivalent
and that the proportionality relation between intensity of stimulation and
rate of responding is modified by deprivation. Such an assumption would
account for such preference threshold data as those of Campbell (1958). In
this study, which investigated the percentage of times that S will drink more
of one concentration when compared with a second concentration or with water,
Campbell found lower absolute and differential preference thresholds with
increasing hours of deprivation. Although these results involved both sensory
and postingestive factors, it seems that they are mainly the result of
differential initial rates of responding and of differential rates of shutoff
for the pairs of solutions compared and thus the result of changing
proportionality between rate and intensity rather than changing thresholds.


SUMMARY 


The relations between initial and terminal 
rate of bar pressing and concentration, vol- 
ume per reinforcement, and interval between 
reinforcements were explored as functions 
of deprivation. The slopes of the initial rate- 
concentration and initial rate-interval func- 
tions were functions of deprivation while the 
initial rate-volume function was not. The 
rate of “shutoff” was most sensitive to deprivation
at large concentrations, large
volumes, and short intervals. Three inde- 
pendent loci of events controlling food inges- 
tive behavior were suggested: the proximal 
reinforcing stimuli, the momentary post- 
ingestive load, and the nutritive state. 
This paper reports a simple two-alternative noncontingent probability
learning experiment with an unconventional feature: each S made 1000
consecutive predictions, making possible very detailed analysis of responses
which occurred after learning was essentially completed.

Some abbreviations will be useful. The S predicts either L or R; after each
prediction he observes either l or r. The probability that S will make
prediction L on trial $t + 1$ will be called $p_t$. The probability of l on
any trial is a constant for any given S; it will be called $\pi$. The
occurrence of a prediction will be called a response; the occurrence of a
display of an event following a response will be called an outcome. An
outcome follows each response; the nature of the outcome is independent of
the nature of the response.

The interpretation of this experi- 
ment will focus on three issues: 


1. The probability matching hypothesis. The probability matching hypothesis
(PMH) asserts that the asymptotic probability of choice, $p_\infty$
($p_\infty = \lim_{t \to \infty} p_t$; it is assumed that this limit exists)
equals $\pi$. It was originally proposed by Grant, Hake, and Hornseth (1951),
is predicted by the Estes learning model (Estes, 1950, 1957; Estes & Burke,
1953; Estes & Straughan, 1954) and by the equal-alpha case of the
Bush-Mosteller learning model (Bush & Mosteller, 1955), and has been
supported by a number of experiments, though not by others.

2. The extreme-asymptote generalization. In 1956 I reported an experiment
which argued against PMH and in favor of a theory about $p_\infty$ which I
call the extreme-asymptote generalization. That generalization asserts that
$p_\infty$ is more extreme than $\pi$, and as the absolute value of the
difference between $\pi$ and 0.5 increases the difference between $p_\infty$
and $\pi$ also increases until $p_\infty$ becomes 1 or 0. As stated, this
hypothesis makes only ordinal predictions; a way of making it yield ratio
scale predictions (and also of applying it to situations in which amount of
payoff is varied) is discussed in Edwards (1956) and applied later in this
paper.

3. Sequential dependencies, the gambler’s fallacy, and path independence.
Stochastic learning theories often assume that the effects of events prior to
a given trial are summarized in a set of probabilities for the responses
available on that trial; this assumption is known as the path independence
assumption (for a better definition, see Bush & Mosteller, 1955, p. 17).
Contradictory to this is the common observation that if a flipped coin comes
up heads eight or nine times in a row, S is likely to decide that tails is
“due” and so predict or bet on it on the next toss. This and similar
sequential effects have been called the gambler’s fallacy; they have been
demonstrated experimentally by many Es. Other hypotheses about sequential
effects in probability learning have also been proposed, with varying degrees
of empirical support.


Adequate study of each of these 
issues depends on long experiments; 
the reasons why will be examined 
in the discussion section. Both PMH 
and sequential dependencies are harder 
to examine at more extreme prob- 
abilities than at less extreme ones. 
So this experiment used only the 
probabilities 0.5, 0.6, and 0.7 and 
their complements. 


METHOD 


Apparatus.—Each S was given a tray
containing 1? IBM mark sense multiple 
choice answer sheets. On top of the stack 
of sheets was a covering board with 80 pairs 
of holes in it, each hole filled by an ordinary 
cork. Each hole exposed two adjacent spaces 
where a mark could be made on the answer 
sheet. The mark sense sheets were prepared 
in advance by filling in one of the two mark 
spaces under the right-hand hole of each 
pair. 

Subjects.—The Ss were 120 basic airmen, 
trainees at Lackland Air Force Base. They 
were unselected except that no S who fell 
in Category 4 (the lowest category) of the 
Armed Forces Qualification Test, a paper- 
and-pencil test of general intelligence, was 
used. But the population of basic airmen 
includes relatively few college level men.
The Ss used in this experiment, therefore, 
are selected from a population which has 
almost no overlap with the college population 
from which Ss have been selected for other 
probability learning experiments, except 
those by Neimark and Shuford (1959) and 
Nicks (1959), who also used basic airmen.

Instructions.—Each S was told to lift the 
upper left hand cork, and to make a mark in 
either the left or the right space on the sheet 
underneath it. A mark in, for instance, the 
left space was a prediction that the left space 
under the other cork of the pair would turn 
out to be filled in. After making the mark, 
he lifted the other member of the pair of 
corks, and saw whether his prediction had 
been correct or incorrect. After this, he 
replaced both corks, lifted the cork immedi- 
ately beneath the first one he had lifted, and 
made his next prediction. When he finished 
80 predictions, he removed the covering board, 
put the finished answer sheet underneath the 
stack, replaced the covering board, and continued
making predictions. All Ss were
instructed: “Your purpose is to get as many
predictions correct as possible. You will not 
be able to get all of them correct at any time 
during the test. There is no pattern or system 
you can use which would make it possible to 
get all of your answers correct. But you will 
find that you can improve your performance 
in the test if you pay attention and think 
what you are doing.” 

Experimental design.—There were 12 groups 
of 10 men each; each S made 1000 binary 
predictions in one unbroken session, usually 
lasting about 3 hr. The Ss came in groups of 
12; each S was arbitrarily assigned to one of 
the experimental groups. Twelve Ss and E 
sat at a long conference table; E monitored 
continuously to make sure that all Ss fol- 
lowed instructions and kept at the task. 
No effort was made or needed to pace Ss.
Each S present at a given time was a member 
of a different experimental group from all 
others then present, so no S could profit 
from looking at another S’s predictions. 

Three basic probabilities were used: 0.5, 
0.6, and 0.7. These numbers are the prob- 
abilities that a prediction of left will be 
correct. Sequences of 1000 trials embodying 
these probabilities were prepared in two different ways, which this paper
will call constrained and random. All constrained sequences were prepared as
follows. First, the expected number of occurrences of runs of length
1, 2, ..., n for each of the two alternatives was calculated, up to a value
of n for which that expected number is less than 0.5. All numbers were
rounded off to integers. The runs of l were put in one box, the runs of r
were put in another, and runs were drawn at random from the two boxes
alternately until both were empty. This procedure makes not only run lengths
but also conditional probabilities (based on sequences which are short
compared with n) come out at their expected values. The random sequences were
simply chosen from a table of random numbers in accordance with their
probabilities, with no constraints at all.
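
The constrained procedure can be sketched in code. The following is only an illustrative sketch, not the procedure as actually carried out: the function names are hypothetical, the expected number of runs of exactly length k is assumed to be approximately N(1 - p)^2 p^k (the text does not give the formula used), and when one box empties first the remaining runs are simply drawn from the other box.

```python
import random


def expected_run_counts(p, n_trials):
    """Rounded expected number of runs of each length for an outcome of probability p.

    Assumption (not stated in the text): runs of exactly length k occur
    about n_trials * (1 - p)**2 * p**k times in n_trials independent trials.
    """
    counts = {}
    k = 1
    while True:
        expected = n_trials * (1 - p) ** 2 * p ** k
        if expected < 0.5:  # stop at the first length expected fewer than 0.5 times
            break
        counts[k] = round(expected)
        k += 1
    return counts


def constrained_sequence(pi, n_trials=1000, seed=None):
    """Build an outcome string of 'l' and 'r' by drawing runs alternately from two boxes."""
    rng = random.Random(seed)
    boxes = {}
    for symbol, p in (("l", pi), ("r", 1 - pi)):
        runs = [k for k, c in expected_run_counts(p, n_trials).items() for _ in range(c)]
        rng.shuffle(runs)  # shuffling, then popping, is a random draw without replacement
        boxes[symbol] = runs
    sequence = []
    symbol = "l"
    while boxes["l"] or boxes["r"]:
        if boxes[symbol]:
            sequence.extend(symbol * boxes[symbol].pop())
        symbol = "r" if symbol == "l" else "l"  # alternate between the two boxes
    return "".join(sequence)
```

Because of rounding, a sequence built this way need not contain exactly `n_trials` outcomes; how the original sequences were brought to exactly 1000 trials is not described in the text.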

Three probabilities and two ways of pre- 
paring sequences require six sequences. Six 
more sequences, each a mirror image of one 
of the six original sequences, were also used. 
The mirror image sequences were prepared by 
substituting an l for each r and an r for each
l. One of these 12 sequences was administered
to each of the groups; all Ss in a group
got the same sequence. 


RESULTS 


Asymptotic probabilities.—Figure 1 shows mean relative frequencies of choice
by blocks of trials. Each data point represents 40 binary choices by each of
10 Ss, or 400 binary choices in all. In each of the eight groups for which
the probability of reward is not 0.5 and so for which PMH and the
extreme-asymptote generalization make different predictions, the results
support the extreme-asymptote generalization. Inspection of the 50-50 groups
suggests that there is a bias in favor of the R response (which is
surprising, since for a right-handed S the L response is a trifle easier to
make), but the bias is not large enough to affect the finding.

Inspection of Fig. 1 indicates that Ss tended to follow local changes in the
probability of reward. A local increase in frequency of l events produces a
local increase in frequency of L predictions, and similarly for decreases.
This effect is superimposed on the slower and larger changes in prediction
with which PMH and the extreme-asymptote generalization are concerned.

Fig. 1. Probability of left response in 40-trial blocks. (The number at the
top of each rectangle is the $\pi$ for that group, as is also the thin
horizontal line within each rectangle. Each data point connected with solid
lines is the relative frequency of prediction of left on a given block of
trials; each point is based on 400 binary choices. Each point connected with
dashed lines is the relative frequency with which the left event actually
occurred in that block of 40 trials.)

TABLE 1

PERCENTAGE OF PREDICTIONS OF LEFT ON LAST 80 TRIALS FOR EACH S NOT
IN A 50-50 GROUP

Constrained | Random
100% | 100%
97 | 93
96 | 88
95 | 88
91 | 87
85 | 85
80 | 80
75 | 65
70 | 60
58 | 56

$\pi$ = 0.4 (Mirrors of 0.6 Groups); $\pi$ = 0.3 (Mirrors of 0.7 Groups)

Constrained | Random | Constrained | Random
48%
47
46
43
43
31
29
22
20 3 0
0 0

Note.—The actual relative frequencies of outcomes in the last 80 trials
deviated slightly from the theoretical probabilities. They were 0.73 for the
0.7 constrained group, 0.74 for the 0.7 random group, 0.61 for the 0.6
constrained and random groups, 0.28 for the 0.3 constrained group, 0.26 for
the 0.3 random group, and 0.39 for the 0.4 constrained and random groups. If
these rather than the theoretical probabilities are used in the nonparametric
test discussed in the text, no change in conclusions results.

Finally, inspection of Fig. 1 indi- 
cates that the difference between con- 
strained and random sequences is 
relatively unimportant except for the 
fact that constrained sequences come 
out more nearly to the expected 
number of l’s and r’s in each block of
trials, and so provide slightly less 
scope for the probability following 
phenomenon discussed above to be- 
come visible. 

A significance test for the difference between the estimated $p_\infty$ and
$\pi$ is desirable. Table 1 exhibits the percentage of choices of L on the
last 80 trials for each S, omitting 50-50 groups. Only 16 Ss out of 80 have
estimated $p_\infty$ equal to or less extreme than $\pi$. If PMH were
correct, at least half the Ss should have estimated $p_\infty$ equal to or
less extreme than $\pi$. The difference is significant beyond the .0001
level. Table 1 also makes it clear that the distribution of estimated
$p_\infty$ is not bimodal; indeed, it looks relatively normal. That fact
permits the use of more sensitive parametric tests of significance, but the
results of the nonparametric test given above make the use of more sensitive
tests unnecessary.
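
The size of this effect can be checked with a short calculation. The text does not name the nonparametric test; the sketch below assumes a one-sided binomial (sign) test with a null probability of 0.5, and the function name is hypothetical.

```python
from math import comb


def binomial_lower_tail(successes, n, p=0.5):
    """P(X <= successes) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(successes + 1))


# Probability of 16 or fewer such Ss out of 80 if each S had an even chance
# of being equal to or less extreme than pi (assumed null hypothesis).
p_value = binomial_lower_tail(16, 80)
print(f"P(X <= 16 | n = 80, p = 0.5) = {p_value:.2e}")
```

Under that assumption the tail probability falls well below .0001, in line with the level reported above.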


Since so many data were collected, 
a number of the variables and inter- 
actions not mentioned here were in 
fact statistically significant; this dis- 
cussion has dealt with all which are 
believed to be also intelligible and 
important. All subsequent statistics 
will combine corresponding random 
and constrained groups and will com- 
bine all 50-50 groups. Each statistic 
was calculated separately for each of 
the 12 groups; in no case does the com- 
bining average numbers or functions 
which appeared dissimilar. 

Sequential effects: Information analysis.—To study the determiners of
responses in a more specific way than Fig. 1 permits, detailed examination of
sequences of responses and outcomes is necessary. For this purpose,
multivariate information transmission analysis (Garner & McGill, 1956;
McGill, 1954) is exceptionally convenient. The model underlying the use of
this statistic assumes stable conditional probabilities; the analysis avoided
basing calculations on changing overall probabilities by using only the last
480 trials. Special attention to the nonorthogonality of predictor variables
and to the choice of proper degrees of freedom for the Miller-Madow (1954)
bias correction and significance test was necessary; for a discussion of
these issues and related ones concerning the application of information
statistics to sequences of responses, see Edwards (1954; in press).

Figure 2 shows the effect of taking increasingly remote predictor variables
into account in predicting responses in the last 480 trials. (In all
information calculations, no differences worth noting existed between
original and mirror groups, so they are combined in Fig. 2 and 3.) Note that
the y axis is the percentage of information in the responses not accounted
for by the predictor variables considered. It is evident that although
increasing numbers of predictor variables improve predictions (a mathematical
necessity), the asymptotic level of predictive effectiveness leaves about 75%
of the response information unexplained. If these numbers were variances,
this would seem like a very large amount of unexplained variance. But they
are not variances; they are ratios of bits of information. Users of
multivariate information transmission analysis always report large
percentages of unexplained response information; in fact, experiments in
which as much as 25% of response information is explained by predictor
variables are very rare (except in psychophysical scaling). No formal
discussion of this common finding is known, but an obvious hypothesis is that
the logarithmic nature of the information measure accounts for this
difference between information and variance analyses.

Fig. 2. Percentage of total response information unexplained by various
predictor variables. (The x axis is cumulative. At Step 0, only Ss are used
as predictor variables. At Step 1, Ss and the immediately preceding outcome
are used. At Step 2, the variables already listed and the immediately
preceding response are used. At Step 3, the variables already listed and also
the second preceding outcome are used. And so on. Only the last 480 trials
were used.)

Fig. 3. Amount of information in bits transmitted by preceding trials to the
present response. (The Ss and trials which intervene between the predictor
trial and the predicted response are held constant. Only the last 480 trials
were used.)

Figure 2 shows how much predic- 
tion can be done, but does not show 
how to do it. In order to get a better 
idea about that, consider Fig. 3. It 
shows the amount of information 
(in bits, not a ratio) transmitted to 
the present response by the pre- 
ceding three trials (calculations for 
the second and third preceding trials
hold what happened in intervening 
ones constant). Again calculations 
are for the last 480 trials only. It is 
apparent that the most information 
is transmitted by the immediately 
preceding trial, and lesser amounts 
by trials prior to that. All amounts 
of information in Fig. 3 are signifi- 
cantly different from zero by the 
Miller-Madow test (1954). 
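
As a rough illustration of the quantities plotted in Fig. 2 and 3, transmitted information between a predictor and the response can be estimated from frequency counts. The sketch below is a simplification under stated assumptions: the function names are hypothetical, it treats a single (possibly composite) predictor, it does not hold intervening trials constant, and it omits the Miller-Madow bias correction and significance test.

```python
from collections import Counter
from math import log2


def entropy_bits(items):
    """Maximum-likelihood estimate of the entropy, in bits, of a sequence of symbols."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())


def transmitted_bits(predictors, responses):
    """T = H(predictor) + H(response) - H(predictor, response), in bits."""
    joint = list(zip(predictors, responses))
    return entropy_bits(predictors) + entropy_bits(responses) - entropy_bits(joint)


def percent_unexplained(predictors, responses):
    """Percentage of response information not accounted for by the predictor."""
    h_response = entropy_bits(responses)
    return 100.0 * (h_response - transmitted_bits(predictors, responses)) / h_response
```

A composite predictor, such as the pair (S, preceding outcome), can be passed as a sequence of tuples, which is one way to mimic the cumulative steps of Fig. 2.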

What is doing the transmitting from each trial to the present response? It
could be responses, outcomes, interactions between them, or any combination
of these three factors. Unfortunately, the interactions between responses and
outcomes are not directly interpretable because of the nonorthogonality of
the predictor variables. Figure 2 is based on a definition of outcomes as
being l or r; call this noncontingent coding. It would also be possible to
define outcomes as + or − (meaning in agreement or disagreement with the
preceding prediction); call this contingent coding. Further analysis of the
data using noncontingent coding shows that almost all information transmitted
by a trial is transmitted by its outcome; the amount of information
transmitted by responses is trivial (though significant; because of the large
numbers of responses involved, just about all differences which are
observable at all are significant in this experiment). The implication, a
sensible one, is that Ss pay little or no attention to their own previous
responses and instead concentrate on the previous set of outcomes in
determining their present response.

Of course, similar analysis applied to contingently coded data shows that
almost all information transmitted by a trial is transmitted by its response;
this is an inevitable consequence of the fact that a + or − is meaningless as
a predictor variable unless the preceding response which defines it is also
considered. So two different methods of coding the data lead to two different
interpretations of the results. A decision between these interpretations
would require examination of the interactions, and nonorthogonality rules out
the obvious ways of doing so. But a stab at it is available. If only the
trial immediately preceding a response is considered, then the information
transmitted from the response and information transmitted from the outcome
should be orthogonal to each other. The information transmitted from the
response can be calculated two different ways: with the effect of the outcome
partialled out, or with the effect of the outcome uncontrolled. Table 2
presents the results of these two methods of calculation for each method of
coding. No substantial difference between methods of calculation appears
unless the method of coding forces it to appear by making the outcome
variable taken by itself meaningless. For that reason, this paper used the
noncontingent method of coding, and will accept the conclusion that Ss are
much more concerned with previous outcomes than with their own previous
responses. Conclusive resolution of the dilemma, however, would require a
three-alternative experiment, in which case contingent and noncontingent
coding would not in general lead to the same amounts of information
transmission.

TABLE 2

INFORMATION IN BITS TRANSMITTED FROM PREVIOUS RESPONSE TO PRESENT
RESPONSE BY THREE METHODS OF CALCULATION

Analysis                                                  | 0.6 |
Intervening outcome ignored                               | 02   | .040
Intervening outcome held constant, noncontingent coding   | .045 | .020
Intervening outcome held constant, contingent coding      | .131 | .058

Sequential effects: Run analyses. 
The information statistics presented 
above examine sequential effects in 
a manner which assumes that the 
extent of sequential dependency is 
independent of the particular se- 
quence considered. Clearly that as- 
sumption can be only a first approxi- 
mation. The literature suggests that 
one kind of past history is especially 
likely to lead to sequential effects: 
homogeneous runs of previous out- 
comes. Rather than examine such 
runs by information methods, it is 
easier to examine conditional prob- 
abilities based on them directly. 
Figure 4, again based on the last 480 
trials only, shows the conditional 
probability (multiplied by 100) that 
L will be predicted given each possible 
preceding homogeneous outcome run 
of length eight or less. The data do 
not permit these probabilities to be 
estimated for longer runs with ac- 
ceptable accuracy. An example may 
make the interpretation of the x
axis easier. The value 4 on the right
run side of the x axis means, for
example, that the points plotted above
it are conditional probabilities of
predicting L given that the last five
outcomes preceding the prediction were
lrrrr. (Note that one actually knows
five preceding outcomes, not four,
since the outcome preceding a
homogeneous outcome run of r must
necessarily be l, and vice versa.)
Figure 4 justifies the conclusion that
outcome runs of length up to four
certainly influence responses, and so
indicates that for at least some past
histories the extent of sequential
dependencies is longer into the past
than the information analysis taken
alone would suggest. But the nature
of the dependencies is that the longer
an outcome run gets, the more likely
S is to predict that outcome. What
happened to the gambler’s fallacy?

Fig. 4. Percentage of left responses
following homogeneous outcome runs. (The x
axis indicates the number of left or right
outcomes included in the run; for further
explanation, see text. Only the last 480
trials were used.)

Fig. 5. Percentage of left responses
following homogeneous outcome runs for 50-50
group Ss only. (The axes have the same
meaning as those in Fig. 4.)
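
The run tabulation described above can be sketched in a few lines. This is only an illustration, not the procedure actually used for Fig. 4: the function name, the single-S list format, and the decision to skip runs that reach the start of the record (whose true length is unknown) are all assumptions made here. A prediction on trial t is taken to be made before the outcome of trial t is seen.

```python
from collections import defaultdict


def run_conditional_percentages(outcomes, predictions, max_run=8):
    """Percentage of 'L' predictions after each homogeneous outcome run.

    outcomes[t] is the event shown on trial t ('L' or 'R');
    predictions[t] is the response made on trial t, before outcomes[t].
    Returns a dict mapping (side, run_length) to a percentage.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [number of L predictions, total]
    for t in range(1, len(predictions)):
        side = outcomes[t - 1]
        length = 1
        # Walk backwards while earlier outcomes repeat the same side.
        while t - 1 - length >= 0 and outcomes[t - 1 - length] == side:
            length += 1
        # The run must be bounded by an opposite outcome; otherwise its
        # length is unknown (it reaches the start of the record).
        if t - 1 - length < 0 or length > max_run:
            continue
        counts[(side, length)][1] += 1
        if predictions[t] == "L":
            counts[(side, length)][0] += 1
    return {key: 100.0 * n_left / n for key, (n_left, n) in counts.items()}


# Hypothetical six-trial record for one S (not data from the experiment).
outcomes = list("LRRRRL")
predictions = list("LLLRRR")
print(run_conditional_percentages(outcomes, predictions))
```

Pooling over Ss, as in Fig. 4, would amount to summing the two counts for each key across Ss before dividing.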

Most experiments which have found
gambler’s fallacies used fewer trials 
than this one. Perhaps the gambler’s 
fallacy is a phenomenon of early 
trials and vanishes later. If so, 
strictly speaking no run curves like 
those in Fig. 4 are appropriate to use 
in studying it during early trials, 
while response probabilities are chang- 
ing rapidly. But it is reasonable to 
assume as a first approximation that 
at least for the 50-50 groups the overall
probabilities are not changing
very fast, and so curves like those in 
Fig. 4 can be based on early trials for 
those groups. Figure 5 presents such 
curves for Trials 1-200, 201-400,
and 401-1000 for all 50-50 Ss. A
small gambler’s fallacy, much smaller
than any previously reported, appears
in the first 200 trials; thereafter
the pattern of run effects
systematically shifts in the direction
of those found in Fig. 4.


DISCUSSION 


Probability matching.—In 1956 I
reviewed all experiments relevant to a
narrow definition of PMH published
up to that time (Edwards, 1956, pp.
184-185). Only experimental groups
in which the two outcomes were mutually
exclusive and exhaustive, in which
successive outcomes were independent,
in which $\pi$ was not 0, 0.5, or 1, and in
which S had had no previous experimental
experience with a different value
of $\pi$ were considered. Of 11 groups
meeting these conditions, only 1 had
an estimated $p_\infty$ which was equal to or
less extreme than $\pi$. In the other 10
groups, $p_\infty$ was always more extreme
than $\pi$. The differences were small, but
they were all in the same direction.

Of experiments containing relevant 
groups published since then, those by 
Gardner (1957), Cotton and Recht- 
schaffen (1958), and Nicks (1959) are 
inconsistent with PMH; those by Nei- 
mark (1956), Engler (1958), Neimark 
and Shuford (1959), and Rubinstein 
(1959) support PMH. No probability 
learning experiments (as here narrowly 
defined) reviewed in 1956 or published 
since then used more than 300 trials at a 
fixed probability except those by Gard- 
ner (1957), Cotton and Rechtschaffen 
(1958), and Nicks (1959), all three of 
which are inconsistent with PMH. 
Figure 1 indicates that in this experi- 
ment probabilities of choice were still 
becoming more extreme at Trial 300 and 
beyond. Longer experiments at fixed 
$\pi$ values might perhaps have produced
fewer acceptances of PMH. 

Why did PMH, at best dubiously sup- 
ported by experimental data, achieve 
such widespread acceptance as a well- 
established truth? Three reasons seem 
plausible. First, it is a good first
approximation to the truth. It is more nearly
correct than the assertion that $p_\infty = 0.5$
for any value of $\pi$, or that $p_\infty = 1$
whenever $\pi$ is greater than 0.5.
Furthermore, it is predicted by some (not all)
stochastic learning models, which
themselves are good first approximations to
the truth. Secondly, few experiments
have run enough trials to obtain a
reasonable estimate of $p_\infty$. Inclusion of
trials on which $p_t$ is still changing
substantially as a function of $t$ in estimates
of $p_\infty$ will, of course, produce estimates
of $p_\infty$ which are less extreme than they
should be, and so come closer to
supporting PMH than they should. (The
use of cumulative relative frequency as
an estimator of $p_\infty$, as in Estes [1957],
will of course bias the estimates in favor
of PMH still more.) Finally, the custom
of obtaining an estimate of $p_\infty$ and testing
the null hypothesis that that estimate
is not significantly different from $\pi$ is
widespread in the probability learning 
literature (and was done in this paper). 
Such a procedure constitutes attempting 
to prove a null hypothesis; the smaller 
the amount of data or the greater its 
variability, the more likely it is that such 
a procedure will ‘‘confirm’’ PMH. This 
is why the small but consistent disagree- 
ments with PMH revealed by most 
probability learning experiments have 
not been noticed. 

The RELM rule.—The extreme-asymp- 
tote generalization is not very specific. 
The data from the previous experiment 
and from this one are consistent with a 
much more specific hypothesis called the 
Relative Expected Loss Minimization 
(RELM) rule (Edwards, 1956, pp. 182- 
185). That rule includes but goes beyond 
the extreme-asymptote generalization, 
and is applicable to a wide variety of 
experiments. For this kind of
experiment, the linear form of that rule
predicts that $p_\infty = 0.5 + K(4\pi - 2)$, where
K is a fitted constant greater than
0.25. The size of K presumably varies
with motivational and other
characteristics of the experimental design. A
least squares fit shows that for the data
obtained in this experiment K = 0.395.
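
The linear form of the rule is easy to evaluate numerically. The sketch below is an illustration only: the function names are invented here, and the fitting routine assumes an ordinary least squares fit of the slope with the intercept held at 0.5, which may or may not be the fitting procedure actually used. With K = 0.25 the rule reduces to probability matching; any larger K gives asymptotes more extreme than the event probability.

```python
def relm_asymptote(pi, k):
    """Linear RELM prediction of the asymptotic choice probability."""
    return 0.5 + k * (4.0 * pi - 2.0)


def fit_k(pis, observed):
    """Least squares K with the intercept fixed at 0.5 (an assumption).

    With x = 4*pi - 2 and y = p - 0.5, the slope through the origin
    is sum(x*y) / sum(x*x). Groups with pi = 0.5 contribute nothing,
    so at least one pi must differ from 0.5.
    """
    xs = [4.0 * pi - 2.0 for pi in pis]
    ys = [p - 0.5 for p in observed]
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# K = 0.25 reproduces probability matching; K = 0.395 is the fitted value.
for pi in (0.6, 0.7):
    print(pi, relm_asymptote(pi, 0.25), relm_asymptote(pi, 0.395))
```

Note that values of K large enough to push the prediction outside the interval from 0 to 1 would need to be truncated; the sketch does not do so.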

Sequential effects.—The surprise in this
experiment is the weakness of the gam- 
bler’s fallacy found, and its disappear- 
ance in later trials. Nicks (1959), 
Anderson (1960), and Anderson and 
Whalen (1960) found much larger gam- 
bler’s fallacies in appropriate groups; in 
fact, Anderson found gambler’s fallacy 
effects even when his sequences were 
designed so that the probability of an 
outcome repetition was higher than it 
would have been had successive
outcomes been independent. (Jarvik [1951]
also found large gambler’s fallacies, but
his experiment was so designed that they
were not at all fallacious.) But this
experiment does not stand alone;
Feldman (1959b) found no gambler’s fallacy
at all in his 200-trial experiment.

No real explanation of this divergence
in presumably similar experiments is
apparent. It is possible, however, that
the relative inconvenience of the
responses in this experiment served to
increase the monotony of what was in 
any case an exceedingly monotonous 
task. The gambler’s fallacy is in a sense 
a highly intellectual response. The S 
must have some idea of what probabili- 
ties are and also must to some degree 
keep track of several preceding outcomes 
in order to exhibit it. For this non- 
college population boredom may reduce 
the amount of intellectual effort applied 
to the task below the level necessary 
to sustain a gambler’s fallacy. 

The gambler’s fallacy is important 
because it is inconsistent with most
reinforcement theories. Bush and Morlock
(personal communication)
have formulated a general conditioning
axiom which in effect asserts that
gambler’s fallacies cannot occur. They have
proposed a procedure for examining run
effects different from that used in Fig.
4 and 5; they examine only the responses
and outcomes included in outcome runs
of a specified length (or longer). These
data were analysed by their method
for Run Lengths 5 and 7. The results
were essentially similar to those in Fig.
4 and 5, but the greatly decreased
number of observations per point resulted
in a considerable decrease in stability.
The evidence about the general
conditioning axiom from this experiment
remains ambiguous.

Hypothesis-testing behavior.—Goodnow
and her collaborators (e.g., Goodnow,
1955; Goodnow & Postman, 1955),
Feldman (1959a), and I (Edwards, 1956)
have argued that people base predictions
in probability learning on local
hypotheses about sequential dependencies. This
idea is very attractive; the sequential
effects examined in this paper make it
more so. Unfortunately, too many
hypotheses (most necessarily incorrect)
are possible, and they change too fast
and too irregularly, to make this an easy
idea to use. Feldman, working with
verbal statements as well as predictions,
has found it necessary to construct one
hypothesis per S. This is the end point
of any attempt to give a detailed, explicit
account of probability learning from a
hypothesis-testing point of view. We
need higher order models, so that each
specific set of hypotheses can be included
within some more general classificatory
or explanatory scheme. No such models
are available at present.


SUMMARY 


A probability learning experiment is re- 
ported in which each of 120 Ss made a se- 
quence of 1,000 predictions about which of 
two mutually exclusive events will occur.
After each prediction, one of the two events 
occurs; the probability of occurrence of each 
event is constant (0.5, 0.6, 0.7 and their 
mirror images). Sequences were randomized 
in two different ways. For all relevant groups,
the asymptotic probability of prediction was 
more extreme than the probability of occur- 
rence of the event predicted; probability 
matching did not occur. The Ss responded 
to small increases or decreases in the relative 
frequency of an event in a block of trials by 
similar small increases or decreases in their 
predictions of that event in that block; this 
phenomenon was named probability following. 

Examination of sequential dependencies
by means of information measures indicates 
that about 25% of response information can 
be accounted for by the identity of Ss and the 
results of the last three trials. The Ss ap- 
parently pay most attention to previous out- 
comes, and much less attention to previous 
responses. Most of the predicting is done by 
the immediately preceding trial; trials further 
back contribute only small amounts of 
additional transmitted information. 

Analyses of homogeneous outcome runs 
on later trials show that the longer the run of 
occurrences of an event, the more likely S 
is to predict that event. For early trials, 
however, Ss show a slight tendency to predict 
the event less often as its run length increases; 
this is the gambler’s fallacy. 


AS RELATED TO INTELLIGENCE LEVEL¹

According to S-R theory (Miller,
1948), learning to respond with distinc- 
tive labels (cue-producing responses) 
to two similar stimulus situations 
should tend to increase the difference 
between them and thus facilitate
subsequent discriminations. Increased
differentiation based on this mechanism
has been called “acquired
distinctiveness of cues” (Miller, 1948,
p. 174). On the other hand, if the
individual learns to attach the same
verbal response to two distinctive
stimulus situations, the resulting
response-produced cues (e.g., “same
name”) give these situations a certain
amount of learned equivalence. The
transfer of an instrumental response
from one such secondarily equivalent
situation to the other is said to be
mediated by the “acquired equivalence
of cues” (Miller, 1948, p. 174)
or secondary generalization.

¹ This study was conducted while the
author was on internship at Southbury
Training School, Southbury, Connecticut, during
1959-60. The study was partially
supported by a grant (M-647) from the National
Institute of Mental Health, National
Institutes of Health, Bethesda, Maryland; and
grateful acknowledgment is made to Neal E.
Miller, Yale University, for his aid and
encouragement as research supervisor during
the internship period, and for critical reading
of drafts of the paper. The writer
particularly wishes to express his appreciation to
C. Edward Stull, Director of Psychological
Service, and to the entire staff of the
psychology department of Southbury Training School
for their help and advice during the conduct
of this study. Shirley Steele was especially
helpful in screening the mongoloid Ss; and
David Barron contributed greatly in
preparing the geometric designs utilized as
stimuli.

² Now at Tennessee Clover Bottom Home,
Donelson, Tennessee.

Again, current S-R formulations
(Dollard & Miller, 1950; Miller, 1948) 
hold that it is primarily the verbal 
cue-producing responses which medi- 
ate man’s higher mental processes; 
and the absence or removal of these 
learned responses (such as occurs in 
repression) results in relatively unin- 
telligent behavior. 

The point is also made by these 
same theorists that the unavailability 
of learned verbal cue-producing re- 
sponses results not only in less intel- 
ligent behavior, but also in more 
primary stimulus generalization (re- 
sponse to superficial sensory simi- 
larity of stimuli which have been 
given otherwise distinctive character- 
istics in the form of different labels) 
where discrimination is required, and 
in less secondary stimulus generaliza- 
tion where the utilization of learned 
equivalence is required. 

The present study is designed to 
investigate the postulated relation- 
ship between verbal cue-producing 
responses, intelligent behavior, and 
primary vs. secondary stimulus gen- 
eralization. 

It was expected that when groups 
of Ss differing in intellectual ability 
are compared in a transfer situation 
under combined conditions of acquired 
distinctiveness and acquired equiva- 
lence of cues, both primary and 
secondary stimulus generalization may 
occur over the entire intellectual range 
studied. However, as one goes up 
the range, relatively more secondary 
generalization responses should be
found; and as one goes down the 
intellectual scale, relatively more re- 
sponses based on primary stim- 
ulus generalization (consequent to 
a relative inability to retain the 
basis for acquired distinctiveness and 
equivalence) should be obtained. 

The specific hypothesis to be tested 
was: There is a significant relationship 
between intelligence level and transfer 
task response, such that higher IQ 
groups tend to give relatively more 
responses based on secondary stimu- 
lus generalization than do compara- 
tively lower IQ groups. 


METHOD 
Subjects 


The S sample consisted of 48 mentally 
retarded and normal boys. Twenty Ss were 
of normal intelligence, ranging in IQ from 
91 to 109 (mean IQ = 99.8), as measured by 
the Otis Quick-Scoring Mental Ability Test. 
These Ss came from Grades 9 through 12 
at the Southbury High School,³ and they
ranged in CA from 14-2 to 17-10 (mean 
CA = 16-4). The retarded Ss (20 high 
grade familials and 8 middle grade mon- 
goloids) were residents at Southbury Training 
School, a state institution for mental defec- 
tives. The high grade group ranged in IQ 
from 51 to 75 (mean IQ = 62.6) and in CA 
from 14-1 to 17-9 (mean CA = 16-1). With 
the exception of 3 Ss, who had been 
tested with the Wechsler Adult Intelligence 
Scale, the Wechsler-Bellevue Intelligence 
Scale, and the revised Stanford-Binet, Form 
L, respectively, IQs for the high grade sample 
were derived from the Wechsler Intelligence 
Scale for Children. The middle grade Ss 
had an IQ range of 26 to 37, with a mean 
IQ of 31.8 (revised Stanford-Binet, Form L), 
and they ranged in CA from 14-1 to 29-1 
(mean CA = 21-8). In the case of the 
latter group, it was necessary to go beyond 
the CA range used in the other groups in order 
to find mongoloids whose speech and
comprehension ability were adequate for the required
experimental tasks.

³ The author is grateful to Thomas J.
Pepe, Superintendent of Schools, Southbury,
Connecticut, and to James H. Heller,
Director of Guidance, Southbury High School, for
their generous assistance and full cooperation
in providing these Ss.

In selecting all Ss for inclusion in the 
study, an effort was made to screen out chil- 
dren with debilitating physical and emotional 
defects. 


Materials and Procedure 


Two different sets of stimuli were em- 
ployed. One set consisted of four geometric 
designs (form stimuli), individually drawn 
on top of otherwise identical square box covers 
(Boxes 1, 2, 3, and 4). The designs, shown 
in Fig. 1, were modified from those used by 
Cantor and Hottel (1957). The other stimu- 
lus set (color stimuli) consisted of four box 
covers (Boxes 1A, 2A, 3A, and 4A) which 
were covered by colored construction paper. 
Boxes 1A and 2A were light and dark blue,
respectively; Boxes 3A and 4A were light and
dark red, respectively. Each of the eight 
stimulus boxes measured 3 X 3 X ? in. 

The normal and high grade retarded Ss 
received both sets of stimuli (Form and Color 
conditions) in succession. To control for 
the possible effect of order of stimulus presen- 
tation on performance in the pretraining and 
transfer situations, a random half of each of 
these groups was presented with the tasks 
involving Form first, the other half receiving 
the Color tasks first. 

However, on the basis of the performance 
of the above S groups with the Color stimuli, 
it was evident that the mongoloids would 
find coping with these stimuli extremely 
difficult, if not impossible. Therefore, the
middle grade retardates were exposed to the
Form conditions only. For all Ss, the
following procedure was introduced as a
“name-learning game.”

Fig. 1. Geometric designs used as
form stimuli.

Form conditions.—The pretraining
situation consisted of two major phases: (a) four
name-learning tasks (Cond. IA, IB, II, and
III); and (b) a candy-finding task (Cond.
IV).

In the name-learning phase, each S was
first required to learn to associate the name
“bif” to Box 1 and the name “mot” to Box 2
(Cond. IA). The S then learned to call Box 3
bif and Box 4 mot (Cond. IB). Under each
condition, S was shown the pair of boxes
together and told the name that went with
each of the boxes. The stimuli were then
presented singly, in a predetermined random
sequence. The S’s errors were corrected by
E, and training was continued until S
responded correctly to five out of five
consecutive presentations. When this criterion was
reached, it was assumed that S had learned
to distinguish Box 1 from Box 2 and Box 3
from Box 4 according to the principle of
acquired distinctiveness of cues. It was also
assumed that, since S had apparently learned
the same name (bif for Boxes 1 and 3 and
mot for Boxes 2 and 4) for quite different
stimuli, each given pair of cues had now
acquired some degree of equivalence for S.

However, in order to make the foregoing
assumptions more tenable, S was next
presented with a pack of 20 cards, made up of
five replicas of each of the four box tops
(Cond. II). The S was told that the cards
had the same designs and the same names as
the boxes; that he was going to be shown the
cards one at a time; and that each time, he
was to tell E the name of the card. The
cards were shown singly, in a predetermined
nonalternating sequence. The S's errors were
again corrected by E until S reached a
criterion of 9 correct responses on any 10
consecutive presentations.

At this point, it was assumed that the 
dual conditions of acquired distinctiveness 
and equivalence of cues had been well estab- 
lished. Nevertheless, since the subsequent 
candy-finding and transfer tasks involved 
the boxes, rather than the cards, the preceding 
procedure was repeated with the four boxes 
(Cond. III). The S was instructed that he 
was now going to be shown the four boxes 
again and that he was to tell E the name 
of each box as it was presented. The boxes 
were shown, one at a time, in a predetermined 
random sequence, and S's errors were cor- 
rected until he spontaneously named the 
boxes correctly on eight consecutive trials. 
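
The three name-learning criteria above (five of five, 9 of any 10, and eight consecutive correct) all have the same form, so a trials-to-criterion score can be sketched with one helper. This is an illustration only: the function name and the list-of-flags input are assumptions, and the score is taken here to be the number of the presentation on which the criterion window is completed, so that criterion trials are included in the count.

```python
def trials_to_criterion(correct, needed, window):
    """Return the 1-based trial on which the criterion is first met.

    correct: sequence of 1 (correct) or 0 (error), one entry per presentation.
    The criterion is met when at least `needed` of the last `window`
    presentations were correct. Returns None if it is never met.
    """
    for end in range(window, len(correct) + 1):
        if sum(correct[end - window:end]) >= needed:
            return end
    return None


# Hypothetical records (not data from the study).
print(trials_to_criterion([0, 1, 1, 1, 1, 1], needed=5, window=5))
print(trials_to_criterion([1, 0] + [1] * 9, needed=9, window=10))
```

A combined name-learning score for one S would then be the sum of the scores obtained under the separate conditions.
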

Under Cond. IV, in pretraining, S was
required to learn to associate a manipulative
response (reaching for and picking up one
of a pair of boxes to find a candy reward)
with a verbal cue-producing response,
involving the name of the particular box
covering the candy. The procedure for this
condition was a modification of that used by
Birge (1941).

Multicolored M & M candies were used
as reinforcement, and a random half of each
group was trained to find the reward under
Box 1 (bif), the other half learning to find
the candy under Box 4 (mot). In presenting
the stimulus pairs to S, the left-right positions
of the boxes were randomized for each S
from trial to trial. All Ss received the same
L-R sequence.

The S was shown Boxes 1 and 4 (with
relative L-R positions opposite to that in
which they were subsequently to appear on
the first learning trial) on a turntable, and
E said:

This time, you are going to get some
chances to find candy under one of these
boxes. Each time, the candy will be under
the bif (or mot). When I show you the
boxes, I want you to say, “The candy is
under the bif (or mot),” and then, pick up
the bif (or mot), and take the candy.
You may eat the candy now, or save it for
later in this envelope (S was given a small
envelope). Be sure to get it right. First,
you say, “The candy is under the bif (or
mot),” and then, you pick up the bif (or
mot) and take the candy. Ready? O.K.


For the first two learning trials, S was
allowed to see the candy (a single M & M)
being placed under the appropriate box.
The turntable was then revolved, and, while
it was still turning, it was hidden from S's
view behind a cardboard screen. After the
screen was in place, the turntable was stopped;
the boxes were moved to their predetermined
L-R positions; and E asked: “Where is the
candy?” On these trials, as on all subsequent
ones, S was required to change incorrect
verbal and/or manipulative responses before
being allowed to take the candy. Thus,
each trial ended with S being finally rewarded
for a series of correct responses.

In an effort to insure that S's manipulative 
response would become associated primarily 
to the verbal response, the procedure was 
modified beginning with the third
candy-finding trial. Now the screen was placed
between S and the revolving turntable before
the candy was placed by E. In this way, S
had no further opportunity to utilize possible
visual cues provided by E’s manipulation of
the appropriate stimulus object; and trials 
were continued until S responded correctly 
without the candy-placement cue from E 
on five consecutive presentations of the 
paired stimuli. Thus, the lowest score any 
given S could achieve in Cond. IV was seven 
trials to criterion. 

When S had attained criterion on the 
preceding task, the transfer situation was 
introduced without a break in the procedure 
and without giving S any warning that
changes were being made in the stimulus
conditions. Having placed the screen, E
removed Boxes 1 and 4 from the turntable
(as had been done prior to all previous trials).
The original stimuli were then replaced by
Boxes 2 and 3, relative L-R positions of this
pair having been randomized from S to S,
and the candy was again placed under the
box with the appropriate name. With the
removal of the screen, S was again required
to choose the box with the candy by giving
the necessary verbal and manipulative
responses.

Each S was given only one transfer trial, 
and, depending upon whether he reached for 
and picked up the box having the same name 
as, or the box bearing the geometric figure 
similar to, that under which the candy was 
found in pretraining, S was scored as having 
chosen “Name” or “Form,” respectively.
This score was taken as indicating whether 
the manipulative response had been mediated 
by secondary stimulus generalization, on the 
one hand, or primary stimulus generalization, 
on the other; and the number of Ss achieving 
the respective scores formed the basis for 
testing the experimental hypothesis. 

Color conditions.—The pretraining and 
transfer procedures (along with the appro- 
priate controls) involving the color stimuli 
paralleled the above methodology exactly. 
Boxes 1A, 2A, 3A, and 4A appeared in situa- 
tions analogous to those where Boxes 1, 2, 3, 
and 4 were used. However, in addition to 
differing from the form stimuli in physical 
appearance, the color stimuli carried a
different set of verbal labels.

Briefly, during pretraining, S first learned
to associate the name “dap” to Boxes 1A
(light blue) and 3A (dark red), and the name
“guz” to Boxes 2A (dark blue) and 4A (light
red). He was then shown Boxes 1A and 4A
and learned to pick up one of these for candy,
at the same time reciting the cue-producing
response: “The candy is under the dap (or
guz).” On the transfer task, S was scored
as having made a “Name” or “Color”
response, and the statistical analysis was
based on the number of Ss scoring in one
category or the other.

In an effort to minimize communication 
from those Ss who had already been seen to 
those yet to come, S was not released im- 
mediately after the main procedure was 
completed. Instead, S was shown four 
boxes, with the picture of a different animal 
(bison, tern, sheep, and owl) on each box, 
and he was told: “Now this is the really
important part. Where is the candy? Pick
up the box with the candy.” Each box
covered five pieces of candy. Thus, S was
amply rewarded for whichever choice he
made. It was hoped that this addendum to
the procedure would serve as an effective
“smokescreen” to S’s memory of what had
come before. Nevertheless, before being
allowed to leave, S was assured that E would
appreciate it if the other boys were not told
exactly what had taken place during the
“game.”


RESULTS 


The statistical treatment was chiefly 
concerned with a chi square analysis 
of the data on the transfer task. 
Nevertheless, comparisons were also 
made between the various groups 
as to trials to criterion under both 
phases of the pretraining situation. 
Since the middle grade retardates had 
worked with the form stimuli only, 
statistical treatment of all the mon- 
goloid group data consisted of com- 
parisons of those data with responses 
made under the Form conditions by 
the other Form-first subgroups only. 

In all the following analyses (pre- 
training and transfer tasks), the .05 
level was used for determining sta- 
tistical significance, and two-tailed 
tests were employed throughout. 


Pretraining 


Name-learning tasks.—Each S’s
scores for Cond. IA, IB, II, and III
were combined into a single
trials-to-criterion score for the given S.
The data for the normals and high
grade retardates were analyzed by an
analysis of variance design (Lindquist
Type III, 1953) involving two
variables (normals vs. retardates,
Form vs. Color, Form-first vs.
Color-first) in each of three respective
dimensions (intelligence level, stimulus
condition, stimulus order) with the
following results: (a) there were no
significant interactions; (b) the
retardates took significantly more trials
to criterion over-all than did the
normals—indicating that name learning
was, in general, more difficult for
the retarded Ss; (c) both groups took
significantly more trials to criterion
with Color than with Form; and (d)
the effect of stimulus order was not
significant.

Individual t test comparisons of
the middle grade retarded group with
the other Form-first subgroups
indicated that the normals learned the
names of the form stimuli in
significantly fewer trials than did the
mongoloids. However, the high grade
and middle grade retardates did not
differ significantly from each other
in the mean number of trials needed
to learn the names of the geometric
designs.

Candy-finding task.—The trials to
criterion were analysed for the data
obtained in Cond. IV, with the
following findings:


1. In comparing the normals and 
high grade retardates, it was found 
that: (a) There were no significant 
interactions. (b) There was no signifi- 
cant difference in mean number of 
trials to criterion under the different 
stimulus conditions. From this, it 
may be inferred that once the names 
had been learned to both sets of 
stimuli, learning to associate a manip- 
ulative response to those names was, 
in general, no more difficult for Color 
than for Form. (c) There was a
significant over-all difference between
the IQ groups—the retardates
apparently having more difficulty than
the normals in learning the
appropriate associations. (d) The effect
of stimulus order was significant.
The Color-first group as a whole
(combined Color-first normals and
Color-first retardates) took more trials
than did the Form-first group as a
whole.


2. The t test comparisons between
the middle grade group and the other 
Form-first subgroups revealed that 
the mongoloids apparently did not 
differ significantly from either of the 
other groups in the mean number of 
trials needed to learn the appropriate 
associations. 


Transfer 


Order of presentation.—The effect 
of stimulus order on Ss’ transfer-task 
responses was examined by a series of 
chi square comparisons of noninde- 
pendent proportions (McNemar, 
1955). In this way, tests were made 
of the significance of overall changes 
from the first to the second set of 
responses (where both sets of stimulus 
conditions had been administered) 
in the various IQ groups and stimulus 
order subgroups. All comparisons 
involved a correction for continuity, 
according to the formula provided by 
McNemar (1955, p. 231) for correcting 
chi squares computed from fourfold 
tables in which the frequencies or 
proportions involved are based on the 
same Ss. In all cases, the hypothesis 
was that the proportion of Ss giving 
a ‘‘Name”’ response on the first trans- 
fer task was equal to the proportion 
giving that response on the second 
transfer task. Table 1 presents a 
summary of the results of the various 
tests of that hypothesis. 
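
For readers unfamiliar with the test, the usual continuity-corrected chi square for nonindependent proportions depends only on the two discordant cells of the fourfold table, that is, on the Ss who changed response from the first transfer task to the second. The sketch below is an illustration only: the function name is invented here, and the counts in the example are hypothetical, since the cell frequencies are not reported.

```python
def mcnemar_corrected(b, c):
    """Chi square (1 df) for change in nonindependent proportions.

    b: number of Ss who gave "Name" first and the other response second.
    c: number of Ss who gave the other response first and "Name" second.
    Uses the continuity-corrected form (|b - c| - 1)^2 / (b + c).
    """
    if b + c == 0:
        return 0.0  # nobody changed, so there is no evidence of a shift
    return (abs(b - c) - 1) ** 2 / (b + c)


CRITICAL_05 = 3.84  # chi square, 1 df, .05 level

# Hypothetical discordant counts: five Ss changed one way, none the other.
chi2 = mcnemar_corrected(5, 0)
print(chi2, chi2 > CRITICAL_05)
```

With these hypothetical counts the statistic is 3.20, which falls short of the 3.84 required at the .05 level with 1 df.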

The nonsignificant chi squares in 
Table 1 indicate that, regardless of 
which stimulus condition was pre- 
sented first, both IQ groups were 
consistent in their responses from 
the first transfer task to the second;
i.e., the tendency to give a “Name”
response was not significantly affected
by stimulus order.

TABLE 1

TESTS FOR SIGNIFICANCE OF CHANGES FROM
FIRST SET VS. SECOND SET OF RESPONSES
WITHIN GIVEN GROUPS AND SUBGROUPS

Group                      Chi Square

Normals                    <1.00
Retardates                 <1.00
Form-first Normals         <1.00
Form-first Retardates       3.20
Color-first Normals        <1.00
Color-first Retardates     <1.00

The experimental hypothesis. Since there was no significant order effect, the chi square comparisons of the normals vs. high grade retardates were conducted without taking stimulus order into account. In the case of these groups, the transfer task data were examined separately for Form and Color conditions, with the following findings: (a) Under the Form conditions, 20 normals and 13 retardates gave the “Name” response. None of the normals and 7 retardates chose “Form.” (χ² = 6.23, corrected for continuity, with P = .02.) (b) Under the Color conditions, 16 normals and 5 retardates made the “Name” response; while 4 normals and 15 retardates chose “Color.” (χ² = 12.13, P = .001.)

As with previous analyses, the responses of the middle grade retardates were compared only with those made under the Form conditions by the other Form-first subgroups. These comparisons revealed the following: (a) Of the 10 Form-first retardates, 6 Ss gave the “Name” response; while 4 Ss chose “Form.” In the middle grade group, none of the Ss chose “Name”; all 8 mongoloids made the “Form” response. (χ² = 4.80, corrected for continuity, P = .05.) (b) Among the Form-first normals, all 10 Ss made the “Name” response, with none choosing “Form.” Comparison of this group with the middle grade retardates yielded a χ² of 14.10 (corrected for continuity), with P = .001.
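
The fourfold-table comparisons reported above can be reproduced with a short sketch. It assumes the usual chi square for a 2 × 2 table of independent groups with the standard continuity correction (half the total N subtracted from the absolute cross-product difference); the function and parameter names are illustrative and do not come from the paper.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int, correct: bool = True) -> float:
    """Chi square (1 df) for the fourfold table

            response 1   response 2
    group 1     a            b
    group 2     c            d

    With correct=True the continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    diff = abs(a * d - b * c)
    if correct:
        # Continuity correction; the corrected difference is floored at zero.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Form conditions: 20 normals "Name", 0 "Form"; 13 retardates "Name", 7 "Form".
print(round(chi_square_2x2(20, 0, 13, 7), 2))  # 6.23
```

With the Form-condition counts the corrected statistic comes to 6.23, the value reported in the text.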


DISCUSSION 


The findings clearly indicate that the 
experimental hypothesis was confirmed. 
Thus, the results on the transfer tasks 
seem to validate the proposed theoretical 
relationship between intelligent behavior, 
verbal cue-producing responses, and pri- 
mary vs. secondary stimulus generaliza- 
tion. 

In general, the findings of the present 
study are consistent with those of other 
research (e.g., Birge, 1941; Cantor & 
Hottel, 1957; Golu, 1958). 

Birge investigated the acquired equivalence of cues formulation with children of normal intelligence. She found that generalization of a manipulative response from one stimulus to another was facilitated when the children had previously learned to give the same names to the two very different stimuli. The study by Cantor and Hottel was concerned with the applicability of the acquired distinctiveness of cues formulation to the learning behavior of mental defectives. The obtained results indicated that teaching the retardate different names for similar stimuli aided the learning of a differential motor response to the stimuli. Golu, in a study summarized by Brackbill (1960), found that a differential verbal response, in addition to a motor response, decreased the number of errors in discrimination learning in mentally defective Russian children.

In addition to being consistent with the cited research, the present investigation goes further by pointing up the differential role played by verbal cue-producing responses in the higher mental processes, as well as the importance of the context within which these processes are operating. The nature of the present stimulus conditions was such that the similar geometric figures were apparently more distinctive than were the similar colors. Nevertheless, the results indicate that the normal group showed relatively greater ability to utilize verbal cues for secondary generalization (i.e., demonstrated a relatively higher level of reasoning ability) under both stimulus conditions than did the high grade retardates. However, a comparison of the responses made by the latter group under Form and Color conditions, respectively, indicates that, for this group, the effectiveness of verbal cues in mediating secondary generalization was related to the relative distinctiveness of the form vs. the color stimuli. On the other hand, the comparison of the middle grade retardates with the other Form-first subgroups seems to emphasize that the mongoloids were comparatively even more limited to responding in terms of the primary, “given” aspects of the stimulus conditions, an apparent indication of very low-level reasoning ability.

It should be pointed out that although all the mongoloid Ss gave the “Form” response, 3 of these Ss had apparently been able to apply the correct name to the manipulated stimulus, even though they did not use this name to mediate transfer. For example, if they had been saying: “The candy is under the bif (cross)” in pretraining, when the boxes were switched for the transfer situation, these Ss said: “The candy is under the mot (different cross),” and they picked up the mot. These Ss picked up the physically more similar box in spite of the fact that they correctly gave it the different name (which had, for them, never been associated with finding the candy). Thus, for these Ss, the “incorrect” manipulative response on the transfer task was made in spite of a correct verbal label.

However, all of the other Ss, in all IQ groups, who transferred on the basis of physical similarity (primary stimulus generalization) not only reached for the “wrong” box, but also gave it the wrong name. These Ss first gave the correct verbal response (e.g., “The candy is under the bif”), following which they picked up the box which had previously been labeled mot. This behavior may have involved at least two mechanisms, which are not mutually exclusive: (a) The training with the candy attached a very strong reaching response to the specific cue on the box. This response generalized to the similar box and mediated the transfer of the name which was wrong for that box. (b) The training with candy so strengthened the tendency to respond to the specific cue on the box with the name of that box that the generalization of this name to the physically similar transfer box was able to override the previous training to give a different name to the transfer box.

In any event, the previous training in naming did not mediate secondary generalization for those Ss who picked up the “incorrect” box.

As expected, the proportion of Ss who failed to show secondary generalization on the basis of the verbal cues was relatively greater in the comparatively lower IQ groups. All the Ss in the lowest group showed such failure. Nevertheless, the fact that some of the mongoloids were able to respond (albeit “incorrectly”) on the basis of the correct verbal label suggests that the ability of this group of mental defectives to utilize verbal symbols in learning and performance may possibly be greatly improved by verbal training.


SUMMARY 


The present study was concerned with the relationship between verbal cue-producing responses, intelligent behavior, and primary vs. secondary stimulus generalization. Under combined conditions of acquired distinctiveness and acquired equivalence of cues, groups of intellectually normal, high grade retarded, and middle grade retarded (mongoloid) boys were compared as to the utilization of learned verbal responses in the transfer of a manipulative response. As predicted, there was a significant relationship between IQ level and transfer score, indicating that the relatively higher IQ groups responded significantly more on the basis of secondary stimulus generalization than did the relatively lower IQ groups.


Recent investigations, neurophysiological as well as psychophysical, have shown that temperature rather than temperature gradient should be considered as the adequate stimulus of the warmth sense organ (Hensel, 1950; Hensel & Zotterman, 1951a; Vendrik & Vos, 1958). However, it is clear that the time dependence of temperature also is important for the stimulation of this sense organ. Adaptation of the skin senses in general is very marked.

The term adaptation is often employed in a variety of ways. In this paper it will be used in the sense that the response to a constant physical stimulus diminishes with time. Hence, adaptation is a special type of dynamic behavior of this system. The response may be nervous activity measured electrophysiologically or sensation. A quantitative description of adaptation requires the measurement of the response to a physical stimulus with a well known time course. The measurement of the magnitude of a sensation, however, presents great difficulties. Therefore in psychophysical experiments one measures the magnitudes of physical stimuli with various time courses which give rise to a threshold sensation or another sensation of constant magnitude. These results allow the dynamic behavior to be calculated.

The comparison of electrophysiological and psychophysical data is very important. It allows one to interpret the electrophysiological responses in terms of their functional significance and to explain the psychophysically determined results. But this comparison is only possible when both responses are measured quantitatively.

Electrophysiological experiments on the sensory nerves of the temperature sense organs of the cat have revealed that the adaptation of the cold and warmth receptors in this animal is very pronounced but not complete (Hensel, 1953). The time-constant of this receptor adaptation has been measured. The adaptation of the sensation in man, however, is complete if the applied change in temperature is not too large. This points to another adaptation originating more centrally than the one of the receptors.

In a previous paper by Vendrik and Vos (1958) it was shown that adaptation of warmth sensation becomes very marked for stimulus durations greater than a few seconds. This adaptation does not agree quantitatively with the receptor adaptation measured electrophysiologically in the cat. The experiments by Vendrik and Vos (1958), however, were not designed for the investigation of this problem. The main limitation was that the durations of the stimuli could not be made short enough. Therefore the threshold measurements of the warmth sensation described in this paper were carried out with the aim to obtain more quantitative information on the adaptation, i.e., on the dynamic behavior, of this sense organ.


METHOD 


For testing dynamic behavior it is necessary to use stimuli of quantitatively well known time course. Moreover, it was desirable to use stimuli of considerably shorter duration than 1 sec.


The dependence of temperature with time 
at the receptor layer of the skin is much more 
accurately known when microwave stimula- 
tion is applied than when infrared radiation 
or thermodes are used. The reason is that 
the absorption coefficient of microwave radia- 
tion of suitable wavelength in water-like 
tissues is so small that the skin above the fat
layer is practically uniformly raised in tem- 
perature. Consequently heat conduction dur- 
ing short exposures does not play an important 
role. With infrared radiation or thermodes, 
by contrast, the rate of change of temperature 
in different skin layers is largely determined 
by heat conduction. 

In the present experiments microwave radiation with a free space wavelength of 10 cm.
has been used to irradiate the inner side of the
forearm. The arm was gently pressed against
the aperture of a rectangular wave guide with
a cross section of 7.2 × 3.6 cm.² Also experiments
with cross sections of 2.5 × 7.2 cm.²
and of 6.0 × 7.2 cm.² were done. Greater
variations in the cross-sectional area of the 
aperture of the waveguide cause reflections 
of the microwave to such an extent that a 
mismatch cannot easily be prevented. The 
equipment, its calibration and its use have 
been previously described (Vendrik & Vos, 
1958). 

When the arm is exposed to microwave 
radiation, a constant energy per unit of time 
is transmitted into the arm. During the 
time that heat loss to surrounding tissues is 
negligible, the temperature rise will be linear 
with time. From measured temperature 
rises on the surface of the skin, as well as 
from calculations, it appears that increase of 
temperature is linear within about 10% when 
exposure times do not exceed 7 sec. (Eijkman, 
1959). 

The experiments consist of measurements 
of threshold energy at different exposure 
times. The threshold measurements have 
been made using the yes-no procedure. The 
S was asked to make a decision whether or 
not a stimulus was presented to him during 
an indicated interval of time. A judgment on 
the existence of a stimulus is based on a neural 
activity that is probably fluctuating. To make 
a choice out of a fluctuating activity is in fact 
a problem of detection, recently investigated 
by Tanner and Swets (1954). 

It is important to keep the Ss uncertain whether a stimulus really exists. The experiments were conducted in such a way that S could freely decide when he wanted to make an observation, while being left in doubt about the occurrence of a “zero” stimulus or a “real” stimulus. All series consisted of 10 trials, which by means of a programing switch were composed of 5 stimuli of an announced strength and 5 “zero” stimuli in random sequence. Mostly well-trained Ss were tested, who thus knew perfectly well what a particular strength of stimulus meant. While S under test was observing the stimuli of the preset program, which was, of course, unknown to him, he wrote down the program he experienced. The possibility is now included that a record of sensation is made while no stimulus (i.e., a “zero” stimulus) was present, a so-called false positive. This gives the opportunity to describe the reliability of a normal report of sensation while a stimulus is present. The probability of perception is defined as the probability of report of sensation minus the probability of a false positive. This definition is chosen because it gives the best correction we can make at the moment for the variations in the observer's attitude. If one adheres to threshold theory, this definition gives the same result as the well-known definition used in threshold theory as regards the course of the curves describing the dynamic behavior of this organ. If one thinks, as we do, that detection theory is a better model, evaluation of the probability of obtaining a false positive can only be given if the distribution and the bandwidth of the noise are known. This knowledge, however, is lacking. Although not ideal, the above mentioned definition provides a good correction for the shift in the observer's detection level.

The curve presenting the probability of perception versus the intensity of the stimulus is S shaped. The intensities are chosen in the region of the greatest slope of the S curve. Every experimental probability curve includes four intensities with the same exposure times; at each intensity six or seven series of 10 trials were performed. Eight different exposure times were used ranging from 0.1 to 10 sec. The Ss were tested during about 3 weeks successively.

The threshold finally is defined as the energy that yields a probability of perception of .85, this being the percentage that could be determined most accurately. Intensities as well as exposure times were controlled electronically.

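
The false-positive correction and the .85 criterion described above can be illustrated with a short sketch. The correction follows the definition given in the text. The threshold step is a deliberate simplification: the paper fits a standard curve to the measured probabilities, whereas this sketch only interpolates linearly between the two measured points that bracket the criterion. All function names and numbers below are hypothetical.

```python
def probability_of_perception(yes_on_stimulus: int, n_stimulus: int,
                              yes_on_zero: int, n_zero: int) -> float:
    """P(report | real stimulus) minus P(report | "zero" stimulus)."""
    if n_stimulus <= 0 or n_zero <= 0:
        raise ValueError("trial counts must be positive")
    return yes_on_stimulus / n_stimulus - yes_on_zero / n_zero


def threshold_energy(energies, probabilities, criterion=0.85):
    """Energy at which the corrected probability reaches `criterion`,
    found by linear interpolation between neighbouring measured points
    (a stand-in for the fitted standard curve)."""
    points = sorted(zip(energies, probabilities))
    for (e0, p0), (e1, p1) in zip(points, points[1:]):
        if p0 <= criterion <= p1 and p1 > p0:
            return e0 + (criterion - p0) * (e1 - e0) / (p1 - p0)
    raise ValueError("criterion is not bracketed by the measured points")


# Hypothetical tallies: six series of 10 trials at each of four intensities,
# i.e., 30 real and 30 "zero" stimuli per intensity.
energies = [1.0, 1.5, 2.0, 2.5]                   # arbitrary energy units
tallies = [(12, 3), (20, 2), (27, 3), (30, 2)]    # (yes on real, yes on zero)
probs = [probability_of_perception(y, 30, fp, 30) for y, fp in tallies]
print(threshold_energy(energies, probs))
```

Linear interpolation is used here only because it needs no assumption about the shape of the S curve beyond monotonicity near the criterion.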


RESULTS 


An example of a measured “probability of perception” curve is shown in Fig. 1. The stimulus which is plotted along the abscissa is the total energy given in one exposure. This quantity is in good approximation proportional to the temperature increase of the receptors at the end of the exposure. From these curves the thresholds, defined as the energy necessary for a probability of perception of .85, are obtained. As all experimental probability curves showed the same shape, a standard curve was fitted to the measured probabilities. This could be done with a standard error of 15%; thresholds could thus be determined with an accuracy of 15%.

FIG. 1. Probability of perception versus total energy applied in one exposure of constant duration $t_e$. (The probability is corrected for false positives. The probability value 0.85 renders the threshold energy.)

In Fig. 2 the thresholds of 2 well trained Ss are plotted as a function of the exposure time. Similar threshold measurements with reduced and increased area of irradiation (2.5 × 7.6 cm.² and 6.0 × 7.6 cm.²) yield the same dependence on exposure time as the one shown in Fig. 2.

FIG. 2. The threshold energy as a function of the exposure time for 2 Ss. (The dots represent experimentally determined values of threshold energy with their standard errors indicated by vertical lines; the solid line represents the theoretical curve of peripheral adaptation; and the broken line represents the theoretical curve of central adaptation.) [Two panels, Subjects R and E: microwave stimulation, area 3.6 × 7.2 cm.², $B/A$ = 2.0, $\tau_1$ = 0.25 sec.; ordinate, energy in one exposure (cal.); abscissa, exposure time $t_e$ (sec.).]

The characteristic features of the measured curves are: (a) Threshold energy increases initially with exposure times up to about 1-2 sec., which means that with a steeper rise of temperature a threshold sensation is obtained at a smaller increase of temperature. (b) For exposure times larger than about 1 sec. the curve tends to run more horizontally. Threshold temperature is then much less dependent on the rate of change of temperature. (c) Finally, threshold energy increases proportionally with exposure time, the threshold rate of temperature change being constant.


DISCUSSION 


In the first place, one can think of 
the dynamic behavior of the receptors 
as a possible explanation of the experi- 
mental results. Hensel and Zotterman 
(1951b) have investigated electrophysio- 
logically the dynamic behavior of the 
cold and warmth receptors in the cat. 
After a sudden change of temperature, 
the temperature fibres respond with an 
initially considerable change of activity 
(overshoot) dropping to a new stationary 
value (Fig. 3). The quantities A and B 
(Fig. 3) of the temperature receptors 
mostly found by Hensel and Zotterman 
have negative values. This means that 
activity diminishes at higher tempera- 
tures in a limited range of skin tempera- 
tures. In our discussion it is not relevant 
whether A and B are negative or positive. 

Mathematically the change of the activity of a group of receptors after a steplike change of temperature can be described as:

$$f_s = A T_0 + B T_0 e^{-t/\tau_1} \tag{1}$$

where $f_s$ = change of neural activity (number of impulses per sec.), $T_0$ = steplike change of temperature (°C.), $A$ and $B$ = constants of the receptor system (number of impulses per sec. per °C.), $\tau_1$ = time-constant of the receptor system (sec.), and $t$ = time after onset of step (sec.).

FIG. 3. Dynamic behavior of temperature fibres in cat and dog found in electrophysiological studies by Hensel and Zotterman (1951b). [Upper trace, temperature; lower trace, frequency of spikes; abscissa, time.]

Hensel (1953) gives another expression 
with two time-constants. However, 
the influence of the second time-constant 
(0.01 sec.) is quantitatively negligible, 
and we do not think that experimental 
evidence thus far justifies the assumption 
of this second time-constant. 

As will be seen, the above formula describes neural activity changing with regard to an already existing activity, which is not taken into account. In other words, activity is measured counting from a physiological “zero.” Naturally, the assumption is made that the receptor system is working linearly for the small temperature rises needed to measure thresholds.

The discrete character of the neural activity, which is the result of the all-or-none mechanism of the nerve, is not considered an objection to the proposed description with a continuous character. For in the statistics of many receptors and many trials any discrete values of activity of single neurons will disappear. If the response to a steplike change in temperature is according to Formula 1, the neural activity caused by a linear increase in temperature, as used by us, is:

$$f_r = aAt + aB\tau_1\left(1 - e^{-t/\tau_1}\right) \tag{2}$$

with $T = at$, $T$ = temperature (°C.), $a$ = rate of change of temperature (°C. per sec.), and $t$ = time after onset of the stimulus.
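
The step from Formula 1 to Formula 2 can be checked by superposition, under the linearity assumption already made in the text. The sketch below treats the linear rise as a sum of infinitesimal temperature steps; the integration variable $s$ is introduced here for illustration only and does not appear in the original.

```latex
% Per unit step (T_0 = 1), Formula 1 gives the response A + B e^{-t/\tau_1}.
% A linear rise T = a t is a sum of infinitesimal steps a\,ds applied at times s,
% each of which has been acting for a time (t - s) when observed at time t.
\begin{align*}
f_r(t) &= \int_0^t a\left[A + B\,e^{-(t-s)/\tau_1}\right] ds \\
       &= aAt + aB\tau_1\left(1 - e^{-t/\tau_1}\right)
\end{align*}
```

The first term grows with the temperature reached, while the second saturates at $aB\tau_1$ once $t$ is several times $\tau_1$; this saturating term is what makes short, steep stimuli relatively more effective.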

Assuming that the activity $f_r$ has to equal or to surpass the value $f_c$ during the exposure time $t_e$ to give rise to perception, the following expression is obtained for the threshold as a function of exposure time:

$$f_c = aAt_e + aB\tau_1\left(1 - e^{-t_e/\tau_1}\right) \tag{3}$$

FIG. 4. Response of temperature fibres to a linearly changing temperature. (The curve is a consequence of the response shown in Fig. 3.)



This formula, which is suggested by the electrophysiological measurements, appears to describe the experimental results for exposure times smaller than about 3 sec. very well. The best fit to the experimental results is obtained by taking $B/A$ = 2.0 and $\tau_1$ = 0.25 sec. (see Fig. 2). The same quantities of all experiments (3 Ss and three different areas) fall in the range $B/A$ = 2-4 and $\tau_1$ = 0.2-0.4 sec. In electrophysiological experiments the cold receptors in the tongue of the cat appeared to have a time-constant in the range 0.3-2.2 sec. (Hensel, 1953). Records of the activity of warmth fibres show a dynamic response that might be described with a time-constant somewhat smaller than found for cold receptors. So the values of $\tau_1$ in our experiments are in the range of comparable electrophysiological data. The same can be said about the values of $B/A$. The mathematical description of dynamic properties given above appears to be applicable to electrophysiological data and to psychophysical experiments with exposure times smaller than 3 sec.


However, if exposure times exceed 2-4 sec., the experimentally determined threshold energy deviates from the theoretical model described above. This means that for exposure times larger than 2-4 sec. the rate of change of temperature determines the perceptibility. If the peripheral adaptation model is accepted, which is based on the electrophysiological studies and confirmed by our psychophysical results, a nonperipheral adaptation must be assumed in addition, which deals with the stationary activity of the receptors. Such an adaptation, which we will call central adaptation, has been shown to exist. Landgren (1957) measured the cortical receptors of cold impulses from the tongue of the cat. His registrations show a marked adaptation. But as the corresponding activity in the peripheral nerves was not measured, a quantitative relationship between peripheral and cortical activity is not known. Electrophysiological data of central adaptation which can be compared with our psychophysical results are lacking.

A quantitative description, therefore, can only be based on our own experimental results. It is perhaps plausible to assume that the response of the central nervous system to a constant input, i.e., peripheral nervous activity, decreases exponentially with time. It appears that the experimental curve is not fitted particularly well by this model. A better agreement is reached when the assumption is made that central adaptation occurs after a retardation time. This retardation time turns out to be 2-4 sec. More definite conclusions about this central adaptation must wait till the adequate electrophysiological experiments have been carried out. These experiments are now in preparation in this laboratory.


Similar studies about adaptation of the touch sense have been performed by us. Also electrophysiological experiments of the response of touch nerve fibres in the cat have been performed. A very satisfactory agreement between psychophysical and electrophysiological results has been obtained (Eijkman & Vendrik, 1960).


SUMMARY 


Psychophysical experiments are described, which were carried out with the aim to investigate the adaptation of the warmth sensation. Microwave radiation was used as a warmth stimulus, giving a relatively accurate knowledge of the time course of the temperature in the deeper layers of the skin.

The dependence of the threshold of the 
warmth sense on the time of exposure shows 
adaptational effects, which are in agreement 
with electrophysiological evidence. Hence, 
for exposure times up to 2-4 sec. the experi- 
mental results are described by a receptor 
system having a time-constant of about 
0.3 sec. 

Exposure times exceeding 2-4 sec. display 
another adaptation, which is believed to be 
of central origin. The obtained curves suggest 
that after the onset of the stimulus this 
central adaptation starts to diminish the 
effect of peripheral inflow after a retardation 
time of a few seconds.


STRENGTH OF A GENERALIZED CONDITIONED
REINFORCER AS A FUNCTION OF
VARIABILITY OF REWARD ¹


RICHARD A. WUNDERLICH ²


Johns Hopkins University


Recently Skinner (1957) has pro- 
posed that a stimulus which has been 
associated with more than one pri- 
mary reinforcer will acquire the 
properties of a generalized conditioned 
reinforcer (GCR). A GCR, as pro- 
posed by Skinner, is a reinforcing 
state of affairs whose “reinforcing 
properties” are effective over a wide 
range of situations. It differs from a 
simple conditioned reinforcer (SCR) 
which controls behavior only under 
appropriate conditions of deprivation 
in not being dependent upon momen- 
tary deprivational conditions. Also, 
a GCR is assumed to have greater 
potential reward value than a SCR. 
Finally, Deese (1952) has pointed 
out that a habit based on a GCR 
should be more resistant to extinction 
than one based on a single reward. 

Only a few studies (e.g., Porter & Miller, 1957; Wike & Barrientos, 1958; Wike & McNamara, 1955) have investigated the properties of a GCR, and these have been concerned primarily with the role that different drive conditions might have in the development of a GCR. For example, Wike and McNamara (1958) paired the stimuli of one goalbox at the end of an alley with wet-mash reinforcement of hunger and thirst drives in rats and the stimuli of another goalbox with wet-mash reinforcement of a hunger drive during the first phase of an experiment. Later, when confronted with both goalboxes in a T maze test, Ss preferred the goalbox previously associated with reinforcement under both drives. While the present literature suggests that conditions of drive are important variables in the determination of a GCR, studies examining other properties, such as the nature of the reinforcing situation itself, are lacking. The present experiment investigates the effects of varying (a) the type of reinforcement (food, water, or both) and (b) the distribution of the types of reinforcement during training trials, on the strength of a GCR as measured by resistance to extinction of a running response. Relative preference for the food and water rewards and spontaneous recovery were also used as supplementary measures in the study. Reward preference was used as a measure of reinforcement “strength”; spontaneous recovery, as another measure of a GCR.

¹ The experiment reported here was submitted as a dissertation in partial fulfillment of the requirements for the PhD degree at the Johns Hopkins University. The author wishes to express his sincere gratitude to Stewart H. Hulse who acted as faculty advisor to this dissertation. His suggestions and criticisms throughout all phases of the experiment were invaluable. The author is also grateful to Leon S. Otis for critically reading the manuscript.

² Now at the Department of Psychology and Psychiatry, Catholic University of America, Washington 17, D.C.

METHOD 
Subjects 


Sixty male albino rats of the Sprague-Dawley strain obtained from Sprague-Dawley Inc., Madison, Wis., served as Ss in the experiment. There were 15 Ss in each of four groups. The Ss were about 95 days old at the beginning of conditioning.


Apparatus 


The apparatus was similar to that de- 
scribed by Hulse (1958). It consisted of a 
straight alley with start and goalboxes. The 
overall inner dimensions of the alley were 48
× 4 × 9 in.; the startbox, 7 × 6 × 9 in.;
the goalbox, 12 × 8 × 9 in. The start and
goalboxes were covered with hinged glass 
lids while the alley was covered with a hinged, 
screened frame. The interior of the startbox 
was white; that of the alley, flat black. The 
interior of the goalbox was also black with 
the exception of the back wall which was white. 
Two reward cups approximately 1.75 in.
in diameter were mounted on the goalbox 
floor near the back wall. One reward cup 
was used for food reinforcement; the other, 
for water reinforcement. 

Manually operated guillotine doors sepa- 
rated the start and goalboxes from the alley. 
Raising the door of the startbox started a 
Standard Electric timer. The timer was 
stopped when the weight of S depressed a 
floor panel in the goalbox. The relay circuits 
and timers were enclosed in a sound-damping 
box. 

A light-gray box 12 × 10 × 10 in. was
used to confine Ss during a 15-sec. intertrial 
interval. The interval was measured by a 
timer which was started as S was placed in 
the box. 


Experimental Design 


Four groups of Ss were run under both 
hunger and thirst motivation. The groups 
differed only in the type of reward that they 
received during conditioning trials. Group F 
received a food reward on all trials; Group W 
received a water reward on all trials; Group 
F + W received both food and water on all 
trials; and Group F/W received food ran- 
domly on half the trials and water on the 
remaining half. Three Ss of each group were 
run during each of five replications of the 
experiment. One week separated the beginning of training trials for successive replications.



Procedure 


Taming.—Upon arrival from the supplier,
all Ss were permitted ad libitum food and
water for a period of 3 days. A 16-day period
of taming followed. During the first 14 days,
a deprivation schedule, which permitted an
average daily food ration of 8 gm. of Purina 
chow and 13 cc. water, was instituted. Each 
day Ss were taken from their housing cage 
and successively placed in groups of 6 in 
four different boxes. In the first box, they 
found one-half of their daily ration of food, 
in the second box, one-half their daily ration 
of water, in the third box the remaining half 
of their food, and in the fourth box, the 
remaining half of their water. 

A 2-day exploration period followed in 
which Ss in groups of 3 were permitted 
to explore the alley for two periods each day. 
Each S of Group F received approximately 
0.50 gm. of dry food (0.045 gm/pellet) ³ in
the goalbox during each of the two explora- 
tion periods. Each S of Group W received 
0.50 cc. water during each of the exploration 
periods, while Ss of Group F + W received 
both 0.25 gm. food and 0.25 cc. water on each 
period. Group F/W received 0.50 cc. water 
on the first period and 0.50 gm. food on the 
second period of the first exploration day; 
this order was reversed on the second explora- 
tion day. As previously mentioned, food 
rewards were always found in one cup in the 
goalbox while water rewards were found in 
the second cup. 

Supplementary rations were given im- 
mediately after the exploration periods. 
These rations brought the average daily food 
consumption to 8 gm. and the average daily 
water ingestion to 13 cc. per S. 

Conditioning.—All Ss received 50 condi- 
tioning trials, 10 trials on each of 5 con- 
secutive days. The intertrial intervals were 
15 sec. Each S of Group F received two dry 
0.045-gm. food pellets per trial as a reward. 
Each S of Group W received two drops of 
water (0.10 cc.) per trial as a reward. Each 
S of Group F + W received one food pellet 
(0.045 gm.) and one drop of water (0.05 cc.) 
per trial. Each S of Group F/W received 
as a reward either two food pellets or two 
drops of water on any one trial. The sequence 
of food or water for Group F/W was randomly 
determined with the restrictions that half 
the trials were water reinforced, that the 
initial trial of each day was water reinforced, 
and that no single kind of reinforcement 
occurred more than three times in succession. 
All Ss of this group followed one of three 
reinforcement schedules. 
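
The restrictions on the Group F/W sequence can be sketched as a small rejection sampler. This is only an illustration, not the procedure the authors used: it assumes that the "half water" rule and the run-length rule were applied within each 10-trial day, and the function names and the seed are hypothetical.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def make_day_schedule(rng, n_trials=10, max_run=3):
    """One day's reward sequence: 'W' = water, 'F' = food.

    Assumed restrictions (applied per day): half the trials are water,
    the first trial is water, and no reward repeats more than max_run
    times in succession.
    """
    half = n_trials // 2
    while True:
        rest = ["W"] * (half - 1) + ["F"] * (n_trials - half)
        rng.shuffle(rest)
        seq = ["W"] + rest  # first trial of the day is water reinforced
        if longest_run(seq) <= max_run:
            return seq


def make_schedule(seed=0, n_days=5):
    """Five daily sequences of 10 trials, 50 conditioning trials in all."""
    rng = random.Random(seed)
    return [make_day_schedule(rng) for _ in range(n_days)]
```

Drawing three such schedules with different seeds would correspond to the three reinforcement schedules mentioned in the text.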

As in the exploration periods, supplementary 
rations were given approximately 1 hr. 
after each daily session. These rations 
brought the total daily food and water 
consumption to 8 gm. of food and 13 cc. of water. 
Supplementary rations were given to each S 
individually in separate cages. 

§ Obtained from P. J. Noyes Co., Lancaster, 
New Hampshire. 

Extinction.—Following conditioning, 20 
extinction trials were given. Ten extinction 
trials were given on the first day and 10 trials 
on the following day. During extinction, Ss 
were permitted to remain in the empty goalbox 
for 5 sec. on each trial. The interval 
between extinction trials was 15 sec. 


Performance Measures 


During conditioning and extinction, the 
total running time per trial was recorded to 
the nearest .1 sec. Total running time was 
defined as the time from the raising of the 
startbox door to the time S entered the goal- 
box and depressed the floor panel. These 
measures were converted to reciprocals and 
multiplied by 100 for purposes of statistical 
treatment. 
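
The conversion described above amounts to a single line of arithmetic. A minimal sketch follows; the function name is hypothetical.

```python
def speed_score(running_time_sec):
    """Convert a running time in seconds to a speed score (100 / time)."""
    if running_time_sec <= 0:
        raise ValueError("running time must be positive")
    return 100.0 / running_time_sec
```

Under this transformation a 2.0-sec. run scores 50 and a 0.5-sec. run scores 200, so faster running gives larger scores.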

As a second measure of performance, 
spontaneous recovery occurring between the 
2 extinction days was examined. To check 
for the statistical significance of spontaneous 
recovery, means of the reciprocal scores for 
the last trial of Day 1 of extinction were 
compared with means for the first trial of 
Day 2 of extinction. 

Finally, the frequency with which the 15 
Ss of Group F + W consumed food before 
water in the goalbox during the 50 training 
trials was recorded. This “preference” 
measure was suggested when it was noted 
during taming that, under the experimental 
conditions used, Ss showed less hesitancy in 
consuming dry food than in consuming water. 

Fig. 1. Running speeds during acquisition. 

RESULTS 
Conditioning 


The acquisition data based on 
group means for blocks of five trials 
are shown in Fig. 1. An analysis 
of variance based on means for the 
last five conditioning trials yielded an 
F of 20.81 (df = 3/56, P < .001). 
Table 1 shows the results of t tests 
on the means for the last five conditioning 
trials. It is evident from the 
results of this analysis that Groups 
F/W and F + W ran significantly 
faster than Groups F and W but did 
not differ from each other in running 
speed. 


An analysis of variance was also 
performed for the entire 50 acquisition 
trials. The resulting F of 25.50 was 
again significant (df = 3/56, P < .001). 
The t tests between means based on 
50 trials showed the same group 
trends as those based on the last five 
acquisition trials with the exception 
that Group F was not significantly 
different from Group F/W. Thus, 
although Fig. 1 suggests that Group 
F + W ran faster than Group F/W 
during training, the difference between 
these groups over the 50 
acquisition trials was not significant. 
Finally, a t test between the combined 
mean of the groups that received 
both types of rewards, whether 
successively (Group F/W) or simultaneously 
(Group F + W), and the 
combined mean of groups that received 
a single reward (Groups F and 
W) resulted in a t of 6.82 (df = 56, 
P < .001). 

TABLE 1 
RESULTS OF t TESTS OF DIFFERENCES BETWEEN 
GROUP MEAN SPEEDS FOR THE LAST FIVE 
CONDITIONING TRIALS 

Group 
F + W 
2.04* 
W 50. 6.68** 
F + W 
F/W 

Note.—The within error term from the analysis of 
variance (182.88) was used to compute t. 
* P < .05, df = 56. 

Extinction 

The extinction curves for Day 1 and 
Day 2 are shown in Fig. 2. It is 
apparent that on Day 1 Group F/W 
ran much faster during extinction 
than any of the other groups; Group 
F + W ran next fastest, followed by 
Groups F and W. An analysis of 
variance of means based on the 10 
trials of Day 1 yielded an F of 23.19 
(df = 3/56, P < .001). The results 
of t tests on differences between these 
means are shown in Table 2. All 
means are significantly different from 
each other. 

An F of 10.45 (df = 3/55, P < .001) 
resulted from an analysis of covariance 
based on the last five conditioning 
trials and the first 10 extinction 
trials. This F may be interpreted as 
showing an overall between-group 
difference after accounting for the 
differences between groups at the end 
of conditioning. 

TABLE 2 
RESULTS OF t TESTS BETWEEN GROUP DIFFERENCES 
IN MEAN SPEEDS DURING THE FIRST 10 
EXTINCTION TRIALS 

Group | Mean 
W 
2.09* 

Note.—The within error term from the analysis of 
variance (188.77) was used to compute t. 
* P < .05, df = 56. 
** P < .01, df = 56. 

Table 3 presents the means and the 
t's between group means based on 
the first 10 extinction trials after the 
means have been adjusted by the 
covariance technique. As before, 
Group F/W shows significantly faster 
running during extinction than any 
of the other groups. On the other 
hand, the greater resistance to extinction 
of Groups F and F + W, as compared 
with Group W, that was found 
with the unadjusted means is no 
longer evident. 

TABLE 3 
RESULTS OF t TESTS BETWEEN ADJUSTED 
GROUP DIFFERENCES IN MEAN SPEEDS DURING 
THE FIRST 10 EXTINCTION TRIALS 

Group | Adjusted Mean 
0.40 

Note.—Adjusted standard errors of the mean difference 
were used to compute t. 
* P < .001, df = 55. 

An analysis of variance of the 
second 10 extinction trials resulted 
in an F of 4.86 (df = 3/56, P < .01). 
Subsequent t tests again showed that 
Group F/W was more resistant to 
extinction than the other groups 
(P < .05). None of the other differences 
were significant, however. 
A covariance analysis yielded an F of 
3.62 (df = 3/55, P < .05) and t tests 
of the differences between adjusted 
means showed only that Group F/W 
was significantly more resistant to 
extinction than Group F + W. The 
other comparisons were not significant. 


Spontaneous Recovery 


As a measure of spontaneous re- 
covery, an analysis of variance was 
performed on the difference between 
running speeds of the last trial of Day 
1 and the first trial of Day 2 of extinc- 
tion. The resulting F of 26.22 was 
significant (df = 1/118, P < .001). 
Only Groups F and F + W showed 
significant spontaneous recovery. The 
t for Group F + W was 5.02 (df = 118, 
P < .001); for Group F the t was 3.98 
(df = 118, P < .001). 


Reward Preference 


A t of 4.57 (P < .001) was found 
between the percentage of trials that 
Ss of Group F + W consumed food 
before water (82%) and chance (50%). 


DISCUSSION 


The principal finding of this experi- 
ment is that the random distribution of 
two types of rewards from trial to trial 
during training resulted in the clearest 
evidence of a GCR as measured by re- 
sistance to extinction. Thus, Group 
F/W ran much faster during extinction 
than either Group F + W, Group F, or 
Group W. Simply using the two rewards 
in the goalbox on each training 
trial, under the present conditions, was 
not sufficient to give any clear indication 
of a GCR. Thus, Group F + W was 
running at about the same speed as 
Groups F and W at the end of each 
extinction day (cf. Fig. 2) and was not 
significantly different from these groups 
after extinction means were adjusted 
by the analysis of covariance. 

These findings can be explained if it 
is first noted that the food reward used 
in the experiment was, operationally, a 
much ‘‘stronger”’ reinforcer than the water 
reward. This conclusion follows from two 
things: (a) Group F ran much faster than 
Group W during training, and (b) Group 
F + W reliably ‘preferred’ food over 
water as determined by first reward selection 
in the goalbox.* Further, a recent finding 
by Logan, Beier, and Kincaid (1956) 
showed that aperiodic reinforcement of 
a running response with nine food pellets 
on half the training trials and one pellet 
on the other half produced extinction 
performance comparable to that pro- 
duced by giving partial reinforcement 
with nine pellets on half the trials and 
nothing on the other half. Both these 
groups were more resistant to extinction 
than a group given nine food pellets on 
all training trials. On the assumption 
that nine food pellets are a “‘stronger’’ 
reinforcer than one food pellet (Logan, 
Beier, & Ellis, 1955), these data suggest 
that varied strength of reinforcements 
during training acts like partial reinforcement 
in increasing resistance to extinction. 

Generalizing from the data of Logan 
et al. (1955) to the present findings, it 
would be predicted that Group F/W 
would show greater resistance to extinction 
than the other three groups because 
it was trained with an operationally 
“strong” reinforcer on half the training 
trials and an operationally “weak” reinforcer 
on the other half. Groups F + W, 
F, and W, on the other hand, received a 
constant “strength” of reinforcement 
from trial to trial, and it would be 
predicted that they would extinguish 
relatively rapidly, as they did. 

* Related to the preference of food over 
water was a finding that has practical significance. 
Under the drive conditions in this 
experiment, it was found that Ss that are 
both hungry and thirsty will accept dry food 
pellets as reinforcement. Contrary to previous 
reports, Ss in this experiment showed no 
hesitation in accepting dry food pellets either 
in the reward situation or in the acceptance 
of food before water in the daily 
feeding. The small size of the food reward 
may have been a factor here. 

It must not be assumed that, in the 
present experiment, simple amounts or 
weights of food and water rewards 
determined reinforcement strength. Al- 
though, for example, Group F + W 
received only half the weight of food 
that Group F/W received on any one 
trial (both groups, however, received 
the same total weight over all 50 trials), 
the running speeds during training were 
not significantly different. Similarly, 
Groups F and W ran much slower than 
Group F + W although the latter group 
received only half as much food and 
water on any training trial as the former 
groups. Clearly some effect due to the 
combination of food and water as a 
reward, which was independent of their 
simple weights, accounted for the faster 
training running speed of Groups F + W 
and F/W. 

Finally, the data from spontaneous 
recovery did not give any consistent 
indication of a GCR. These data can 
be accounted for by noting that the 
two groups which did show significant 
spontaneous recovery (Groups F + W 
and F) were the only ones to show any 
appreciable loss in running speed over 
the 10 trials of Day 1 of extinction. 
Consequently, they were the only ones 
which could demonstrate much of an 
increase in running speed on the first 
trial of Day 2 of extinction. 


SUMMARY 


Four groups of rats under hunger and 
thirst motivation were used to investigate 
Skinner’s notion of a generalized conditioned 
reinforcer (GCR). The groups differed only 
in the type of reward (food, water, or both) 
and the distribution of the types of reward 
that they received during conditioning of a 
running response in a straight alley. 
The results showed evidence of a GCR 
in that a group that received either food or 
water reinforcement randomly from trial 
to trial demonstrated greater resistance to 
extinction than groups receiving both food and 
water, only food, or only water from trial to 
trial. 

In addition to the importance of the 
distribution of the types of reinforcement 
in the formation of a GCR, it was suggested 
that differences in reinforcement “strength” 
may serve as an additional variable of some 
importance in developing a GCR. 


The results of simple probability 
learning experiments, such as light 
guessing in a two-choice situation, 
generally support the matching rule of 
stimulus probability and S’s response 
probability (Estes, 1959). These 
findings are in conflict with the results 
of a more complex experiment where 
six probabilistic verbal associations 
were concomitantly learned. In the 
latter experiment, it was found that 
probabilities from .66 to .90 yielded 
systematic overresponding of the re- 
spective stimulus probability, and the 
probabilities of .10 to .40 yielded corresponding 
systematic underresponding 
(Voss, Thompson, & Keegan, 
1959). One suggestion of the latter 
paper was that the response tendencies 
are related to the probability dis- 
criminations of Ss. The present paper 
reports two experiments designed to 
investigate probabilistic psychophys- 
ical judgments and suggests a relation 
of probabilistic discrimination and 
probability learning. 

The psychophysical investigation 
of thresholds typically involves pres- 
entation of a stimulus event and 
subsequent judgment by S_ with 
respect to some aspect of the stimulus 
presentation. The present experi- 
ments extend psychophysical investi- 
gation to judgments of probabilistic 
events. The stimulus was a sequence 
of 50 light presentations, where one 
and only one of two lights was 
presented on each presentation. Stimulus 
probability was manipulated by 
variation in the relative number of 
each of the two lights in the sequence. 
Judgments made at the end of each 
series of 50 lights regarding the estimated 
occurrence of one light provided 
a response measure of estimated 
probability. 

1 The research reported in this article was 
supported by Grant M-3531 of the National 
Institute of Mental Health. 
2 Now at the University of Wisconsin. 


EXPERIMENT 1 
Method 


Two 7-w. light bulbs, labeled A and B, 
were mounted at an equal height on an upright 
board. The light sequences were presented 
by E's manipulations of switches that were 
behind and below the stimulus light board 
and out of Ss' view. A low-volume electronic 
metronome was used to maintain a rate of 
1 light/sec. 

Eighteen groups of Ss were randomly 
assigned to a 9 X 2 factorial design. The 
nine probability conditions (PC) employed 
were: .10, .20, .30, .40, .50, .60, .70, .80, and 
.90. The light judged, A or B, was orthogonal 
to PC. The N/cell was 10 and groups of 10 
were run whenever possible. 

The stimulus sequences were presented by 
the method of single stimuli (Woodworth & 
Schlosberg, 1954) with the modification that 
each stimulus occurrence was a sequence of 
50 light presentations. Use of this number 
of light presentations enabled the probability 
of one light to vary from 0 to 1.00 in steps of 
.02. Eleven probability values of randomly 
obtained light sequences were used at each 
PC. These values centered about the PC 
value in steps of .02; e.g., the probability 
values of the light judged at the .60 PC were 
.50, .52, .54, ..., .66, .68, .70. Upon 
presentation of each light sequence, Ss judged 
whether the respective light occurred “more 
than” or “less than” the particular PC value. 
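
The stimulus values at each PC follow directly from the description above. The sketch below enumerates them, working in whole percentages to avoid floating-point rounding; the function names are hypothetical.

```python
def sequence_percentages(pc_percent, n_values=11, step_percent=2):
    """Percent occurrence of the judged light in each sequence at one PC.

    Values are centered on the PC and spaced step_percent apart
    (steps of .02 correspond to one light in a 50-light sequence).
    """
    half = n_values // 2
    return [pc_percent + step_percent * k for k in range(-half, half + 1)]


def judged_light_counts(pc_percent, n_lights=50, n_values=11):
    """Number of occurrences of the judged light in each 50-light sequence."""
    return [p * n_lights // 100
            for p in sequence_percentages(pc_percent, n_values)]
```

For the .60 PC this gives percentages 50 through 70 in steps of 2, that is, 25 through 35 occurrences of the judged light out of 50.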

At each PC, four random series of the 11 
probability values were employed. Thus 
each S made 44 judgments (11 sequences at 
the respective PC X 4 random series of these 
values). 

JUDGMENTS OF STIMULUS SEQUENCES 


The Ss were students at the College of 
Wooster and were instructed as follows: 


This is an experiment in your ability to 
make accurate judgments regarding the 
frequency of occurrence of lights. You are 
to make the judgments on the paper 
handed out to you. Lights A and B will 
come on individually a given number of 
times. At the end of the sequence, I will 
say “Judge.” This means that you are to 
check the word that indicates whether you 
think Light A (B) came on more than or less 
than ___ per cent of the times I have presented 
the light. So if you feel Light A (B) came on 
more than ___ per cent, check “more;” if 
you feel it came on less, check “less.” You 
are to check “more” or “less” but not both. 
Please make no effort to count the lights. 
You are to observe the lights and write 
your judgment when I say “Judge.” 
When you have checked the word, turn 
that page over and do not look at it again. 
Are there any questions? 


Results 


Figure 1 presents the percentage of 
“more than” responses as a function 
of PCs for the four series. Since one-half 
of the sequences judged at each 
PC were above and one-half below 
the respective PC, the Series 1 data 
indicate a systematic tendency to 
overestimate the sequences at PCs 
greater than .50 and underestimate 
the sequences at PCs below .50. On 
the subsequent three series, the judgments 
of the .60, .70, and .80 PC 
sequences remained above the 50% 
response level, and the sequences of 
the .10, .20, and .30 PCs yielded 
responses at approximately the 50% 
level. 

Fig. 1. Percentage of “more than” responses 
as a function of probability condition 
with Series as parameter. 

An analysis of variance performed 
on the data of each PC revealed a 
significant Series source of variation 
at each PC except .30 and .50. The 
Light Judged (A or B) and Light 
Judged X Series interactions are not 
significant for any PC. 

Since the data for most PCs indi- 
cated that the significant series effect 
is likely due to differences between 
Series 1 and the remaining three 
series, the data of Series 2, 3, and 4 
were pooled for further analysis. 

The Series 1 data were analyzed 
by a test of skew-symmetry in order 
to determine if the PC data were 
symmetric about the .50 PC. The 
frequency of “less than” responses 
was employed for the .10-.40 PCs, 
whereas the “more than” response 
frequency was used for the .60-.90 
PCs. The analysis consisted of a 
2 X 4 factorial design with the respective 
sources of variation of Above vs. 
Below the .50 PC and Amount Above 
and Below the .50 PC. 
The Above vs. Below source of 
variation is significant (F = 6.56, 
df = 1/152). This finding indicates 
a significantly greater number of 
“more than” responses at PC above 
.50 than “less than” responses at PC 
below .50. The same analysis performed 
on the pooled data of Series 
2, 3, and 4 again revealed a significantly 
greater number of “more than” 
responses at the .60-.90 PCs than 
“less than” responses at the .10-.40 
PCs (F = 13.59, df = 1/152). 
The Amount and Amount X Above 
vs. Below sources of variation are not 
significant in either the Series 1 or 
the pooled Series 2-4 analyses. 
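
The coding used in the skew-symmetry test can be sketched as follows. This is an illustration with hypothetical function names. The degrees-of-freedom check assumes that each of the 8 cells pools the two Light Judged groups (2 X 10 = 20 Ss), an assumption that reproduces the reported error df of 152.

```python
def code_condition(pc_percent):
    """Map a PC (in percent, .50 excluded) to its cell in the 2 X 4 design.

    Returns (side, amount, response tallied): "more than" responses are
    tallied above the .50 PC and "less than" responses below it.
    """
    if pc_percent == 50:
        raise ValueError("the .50 PC is not part of the skew-symmetry test")
    side = "above" if pc_percent > 50 else "below"
    amount = abs(pc_percent - 50)
    response = "more than" if side == "above" else "less than"
    return side, amount, response


def error_df(n_cells, n_per_cell):
    """Within-cell error degrees of freedom for a between-S factorial."""
    return n_cells * (n_per_cell - 1)
```

With 8 cells of 20 Ss each, `error_df(8, 20)` gives 152, in agreement with the df of 1/152 reported for the Above vs. Below source of variation.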


Figure 2 presents psychophysical 
curves for the data of Series 1 and 
the pooled Series 2-4. The .60-.90 
PC pooled Series 2-4 data indicate a 
relatively large frequency of “more 
than” responses over all stimulus 
sequences, but an especially high frequency 
at stimulus sequences below 
the particular PC value (e.g., .70 in 
the .70 PC). The Series 1 data of the 
.60-.90 PCs indicate the same tendencies, 
with a generally greater 
response frequency at sequences below 
the respective PC values. 

The .10-.40 PC Series 1 data indicate 
a corresponding tendency in that 
sequences above the PC value sequences 
tend to yield relatively fewer 
“more than” responses. The pooled 
Series 2-4 data for the .10-.40 PCs 
indicate, however, an increase in 
“more than” responses at sequences 
above the PC value sequences. 

Figure 2 also reveals that the se- 
quences at the particular PC values 
of .60-.90 were generally overesti- 
mated in both Series 1 and the pooled 
Series 2-4. The initial tendency in 
Series 1 for the .10-.40 PCs at the PC value 
sequences was to underestimate; but 
in the pooled Series 2-4, judgments 
of sequences at the particular PC 
values varied. 

Table 1 presents the PSE and DL 
values for Series 1 and the pooled 
Series 2-4. The PSE values for 
Series 1 of .60-.90 PCs reflect the 
asymmetric results of Fig. 2. Thus, 
since the responses of the stimulus 
sequences below the respective PC 
sequences received a relatively large 
number of “more than” judgments, 
the 50% response point is found on 
the lower side of the PC value. Likewise, 
the PSE values of Series 1 for 
the .10-.40 PCs indicate a PSE above 
the PC values because of a relatively 
low level of “more than” judgments 
above the particular PC values. 
The pooled Series 2-4 PSE values 
generally remained on the same side 
of each PC as for the Series 1 .60-.90 
PC values, but for the .10-.40 PCs, 
less consistency occurred. 

The DL values generally indicate 
a reduction in variability on the 
pooled Series 2-4 compared to Series 1. 
Also, the restriction of variation in 
sequences near 0 and 1.0 is reflected 
by the small DL of the .10 and .90 
PC conditions. 

Fig. 2. Percentage of “more than” responses for the sequences of each PC with 
Series 1 and pooled Series 2-4 as parameter. 

TABLE 1 
PSE AND DL VALUES FOR THE PROBABILITY 
CONDITIONS OF EXP. 1 AND 2 

Experiment 1, Pooled Series 2-4 

PC     PSE 
.90    .896 
.80    .785 
.70    .688 
.60    .589 
.50    .502 
.40    .400 
.30    .307 
.20    .187 
.10    .096 


Discussion 


The initial tendency to overestimate 
the stimulus sequences at .60-.90 PCs 
is especially associated with overestimation 
of sequences below each respective 
PC value. It should be noted that if 
all sequences at the .60-.90 PCs are 
overestimated, there is a greater possibility 
for an increase in “more than” 
responses at sequences below the PC 
value. The initial underestimation of 
the .10-.40 PCs is associated with fewer 
“more than” responses at the stimulus 
sequences above the respective PC. 
The Series 1 PSE values as well as the 
judgments made at the particular PC 
value sequences are in general agreement 
with the asymmetry of responses at the 
respective PC. 

The pooled Series 2-4 data indicate 
that the initial tendencies diminish, although 
the .60-.90 overestimation response 
level persisted more than the 
.10-.40 underestimation tendency. Figure 
2 reflects these differences in that 
the asymmetry of responses was maintained 
at the sequences of the .60-.90 
pooled Series 2-4, whereas the .10-.40 
pooled Series 2-4 data indicate more 
symmetric curves. 

The four particular series at each PC 
and practice over the series are confounded 
in the significant series effect. 
The consistency over PC of the initial 
overestimation of .60-.90 PCs and the 
initial underestimation of .10-.40 PCs 
strongly suggests that although a certain 
degree of variability is associated with 
the intraseries differences, the Series 1 
and the pooled Series 2-4 results are 
due largely to initial estimation tendencies. 


EXPERIMENT 2 
Method 


Experiment 2 was designed to determine 
the generality of the findings of Exp. 1 under 
conditions of greater stimulus control and 
more standardized conditions. In particular, 
practice was extended by use of a 9 X 9 latin 
square design in which each group of Ss 
served in all nine PCs. In addition, a stand- 
ard stimulus was employed for one-half of the 
Ss. The standard involved presentation of 
the particular PC value to be judged by 
showing Ss light sequences of the PC value. 

The procedure varied from Exp. 1 in that 
one randomly permuted 9 X 9 latin square 
was used for both the standard (S) and non- 
standard (non-S) conditions. The Latin 
letters were the PC values of Exp. 1. Eight 
Ss served in each row of the square. 

Random sequences of 50 lights were pre- 
sented at each PC value, as in Exp. 1, with 
the modification that instead of 11 probability 
values at each PC, 9 probability values were 
used which centered about the respective 
PC value in steps of .02. Furthermore, since 
each group served at each PC, the 9 prob- 
ability values at each PC were permuted for 
the 9 groups (rows) by use of a 9 X 9 latin 
square. Thus, the 9 stimulus probability 
values within each PC were virtually counter- 
balanced over the 9 groups. 
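
A randomly permuted latin square of the kind described can be sketched as below. The report does not say how its square was generated; this illustration assumes a cyclic square whose rows, columns, and symbols are shuffled, and the function names and seed are hypothetical.

```python
import random


def random_latin_square(symbols, seed=None):
    """Return an n X n latin square over the given symbols.

    Starts from the cyclic square (r + c) mod n, then randomly permutes
    rows, columns, and symbol labels. Each permutation preserves the
    latin property, though not every latin square can be reached.
    """
    rng = random.Random(seed)
    n = len(symbols)
    base = [[(r + c) % n for c in range(n)] for r in range(n)]
    rows = list(range(n))
    cols = list(range(n))
    labels = list(symbols)
    rng.shuffle(rows)
    rng.shuffle(cols)
    rng.shuffle(labels)
    return [[labels[base[r][c]] for c in cols] for r in rows]


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and column."""
    n = len(square)
    expected = set(square[0])
    if len(expected) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == expected for row in square)
    cols_ok = all({row[c] for row in square} == expected for c in range(n))
    return rows_ok and cols_ok


# Rows are groups of Ss, columns are ordinal positions, and the
# entries are the nine PC values (in percent).
PCS = [10, 20, 30, 40, 50, 60, 70, 80, 90]
square = random_latin_square(PCS, seed=1)
```

Each row of `square` gives the order in which one group of Ss would meet the nine PCs, so that every PC appears once in every ordinal position.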

The same latin square was employed for 
the S group as for the non-S group. In addi- 
tion, a standard sequence of lights, which 
always contained the PC value to be judged, 
was presented before each sequence of lights 
to be judged. In order to eliminate the 
possible effect of a particular standard se- 
quence, nine such standard sequences were 
employed at each PC and constituted the 
Greek letters of a 9 X 9 greco-latin square. 

The presentation rate and apparatus were 
the same as those of Exp. 1. Instructions 
were modified only to take into account the 
use of a standard and the changing PC 
conditions. 


Results 


Figure 3 presents the data for the first PC 
judged by all groups in Exp. 1 and 2. 
Experiment 2 data of Fig. 3 were 
analyzed by the same test of skew-symmetry 
used in Exp. 1, with the 
modification that the S vs. non-S 
source of variation was included 
(2 X 2 X 4 factorial analysis). The 
significant effect of the S vs. non-S 
conditions (F = 11.0, df = 1/112), 
in conjunction with the lack of any 
significant S vs. non-S interaction, 
indicated that the non-S group made 
a greater number of responses than 
the S group (“more than” responses 
for .60-.90 PCs and “less than” for 
.10-.40 PCs). The only other significant 
source of variation is Amount 
of deviation from the .50 PC (F = 3.5, 
df = 3/112). However, this result 
only indicates that the .70 and .30 PC 
conditions deviated more from the 
50% level than the other three 
virtually equal Amount conditions. 

Fig. 3. Percentage of “more than” responses 
as a function of probability condition 
for the first judged probability condition of 
the standard and nonstandard conditions of 
Exp. 2 and for Exp. 1. 



The latin square analysis performed 
on the “more than” response frequency 
data revealed a significant PC source 
of variation (F = 2.34, df = 8/1408), 
and a lack of significance of the S vs. 
non-S and S vs. non-S X PC sources 
of variation. The only significant 
source of variation in addition to PC 
is PC sequence (F = 4.91, df = 8/8). 
Inspection of the PC sequences of the 
latin square revealed that occurrence 
of the .70, .80, and .90 PCs early in 
the sequence produced a greater 
number of “more than” responses 
over the PC sequence, whereas occurrence 
of the .70, .80, and .90 PCs later 
in the sequence yielded fewer “more 
than” responses over the sequence. 
The psychophysical curves for the 
S and non-S conditions were similar 
to the pooled Series 2-4 curves of 
Fig. 2. The major difference in the 
S and non-S curves is that the S 
condition apparently produced more 
uniform curves with more accurate 
extreme judgments than the non-S 
condition. 

Table 1 presents the PSE and DL 
values for the S and non-S groups. 
The PSE values again generally 
occurred on the same side of the 
respective PC as found in Exp. 1. 
In addition, a slight tendency occurred 
for the PSE to be higher in the non-S 
condition. The DL results indi- 
cate that, especially with lower PCs, 
the standard reduced variability of 
judgments. 


Discussion 


The results of both experiments indicated 
an initial tendency to overestimate 
PC sequences above .50 and underestimate 
PC sequences below .50. Furthermore, 
Exp. 1 results indicated that 
the former tendencies were greater than 
the latter. The lack of Above-Below 
differences in Exp. 2 suggests that the 
relatively greater estimation of .60-.90 
PC may be a function of the particular 
initial stimulus sequences employed. 
Figure 1 and the significant Series effect 
of Exp. 1 suggest that with practice, 
the initial response levels tend to approach 
the 50% response level. The 
.60-.80 PCs, however, remained consistently 
above the 50% level and 
significantly different from the .10-.40 
PCs underresponding. The practice 
effect (Ordinal Position) of Exp. 2 is 
confounded with PC sequences. Moreover, 
since sequence is significant, Exp. 2 
suggests that the practice effects vary 
with PC sequences previously judged. 
Therefore, because of the use of only 
four series in Exp. 1 and because of the 
PC sequential effects in Exp. 2, no 
conclusions may be reached regarding 
long term practice effects. In particular, 
the amount of practice necessary to 
reduce the overestimation judgments of 
the .60-.80 PCs of Exp. 1 to the 50% 
response level, if such effects occur, 
cannot be specified. 

The use of a standard stimulus apparently 
makes judgments more accurate 
early in practice and 
at the extreme sequence values of each 
PC. Moreover, the lack of a significant 
difference between the S and non-S 
conditions in the latin square analysis 
indicates that the differences of S and 
non-S are essentially eliminated by 
practice. 

It should be noted that the use of 50 
light presentations as a stimulus event 
implicitly contains a sequential factor 
not found in single stimulus events. 
The results of the present experiments 
therefore may be dependent to a presently 
unknown degree on the use of particular 
sequences of 50 lights. 

One methodological problem of the 
present experiments is counting. At a 
1 light/sec rate, there is the possibility 
of counting. However, Ss were not 
provided with any suggestion or material 
to compute percentages, and judgments 
were made in an interval of only 5 sec. 
Moreover, it would be suspected that 
if S counted, he should have each judg- 
ment correct. Inspection of the data 
revealed that no S consistently or even 
often made all correct judgments at any 
PC or in any series. 

The initial tendencies of the present 
results are somewhat in agreement with 
other studies on probability prediction 
and estimation. Neimark and Shuford 
(1959) found that in a two-choice situa- 
tion, Ss instructed to estimate and 
predict the occurrence of events esti- 
mated accurately, but overpredicted a 
compound event which involved simul- 
taneous presentation of two samples of 
matrix elements. The present experi-
ments, however, involved judgments after
a sequence of stimuli, whereas Neimark
and Shuford (1959) employed judgments 
after each light presentation, and Shuford 
(1959) studied judgments of simultaneous 
stimuli. 

The discrepancy between the one-association
situation (Estes, 1959) and
the more complex six-association situa-
tion (Voss et al., 1959) may be due to
the aspects of the stimuli to which Ss
respond. In the one-association case,
Ss likely respond primarily to sequential
characteristics such as run lengths, etc.
(Anderson, 1960; Nicks, 1959), whereas
in the six-association paradigm, inter-
ference is produced between the stimuli
and responses, and Ss respond not to
sequential characteristics, but primarily
to frequencies of the events. Responding
to frequencies thus produces results
similar to the present experiments in
that the probabilities above .50 yield
overresponding and the probabilities
below .50 produce underresponding.

Finally, the present results are relevant
to decision making in that they present
subjective probability as a function of
objective probability in a rewardless,
riskless situation. This function provides
a baseline for comparison of the relation
of subjective probability and objective
probability under conditions of reward
and risk (Edwards, 1954).

SUMMARY 


Two experiments were reported on the
ability of Ss to judge the probability of
occurrence of a given light. Sequences of 50
lights were presented where one of two lights
was shown on each occurrence. Probability
of a particular light in a 50-light sequence was
from .10 to .90 in .10 interval steps. At each
probability value, stimulus sequences were
presented in random order in probability steps
of .02, centered about the particular
probability value.

The results indicated an initial systematic 
tendency of Ss to overestimate probabilities 
from .60 to .90 and underestimate probabili-
ties from .10 to .40. Experiment 1 indicated
that the former tendency was greater than 
the latter, both initially and after practice. 
Experiment 2 did not provide information
regarding practice except that the sequence of
PCs employed was significant. Initial judgments
were more accurate when the standard was 
used in Exp. 2. The results are discussed in 
relation to probability learning and decision 
making. 